EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT

Volume VI
1946

917 FIFTEENTH ST., N. W., WASHINGTON 5, D. C.


Frederic Kuder, Editor


ASSOCIATE EDITORS 
Dorothy C. Adkins ...... United States Civil Service Commission
Forrest A. Kingsbury ...... University of Chicago
Fred McKinney, Editorial Representative of the American College
    Personnel Association ...... University of Missouri
M. W. Richardson ...... United States Civil Service Commission


BOARD OF COOPERATING EDITORS 


John G. Darley
University of Minnesota

Harold A. Edgerton
Ohio State University

Max D. Engelhart
Chicago City Junior Colleges

E. B. Greene
United States Employment Service

J. P. Guilford
University of Southern California

E. F. Lindquist
State University of Iowa

Charles I. Mosier
Office of the Secretary of War

P. J. Rulon
Harvard University

David Segel
U. S. Office of Education

C. L. Shartle
Ohio State University

H. C. Taylor
The W. E. Upjohn Institute for Community Research

Thelma G. Thurstone
University of Chicago

Herbert A. Toops
Ohio State University

E. G. Williamson
University of Minnesota

Ben D. Wood
Columbia University

John R. Yale
Science Research Associates


The journal is open to (1) reports of research on the development and use of
tests and measurements in education, government, and industry, (2) descriptions of
testing programs being used for various purposes, (3) discussions of problems of
measurement in general or in specific fields and (4) miscellaneous notes pertinent
to the measurement field, such as suggestions of new types of items or improved
methods of treating test data. Contributors receive one hundred reprints of their
[...] without charge. Manuscripts should be sent to Frederic Kuder, 917 15th
Street, N. W., Washington 5, D. C. Writers are requested to include a biographical
sketch [...] following the style of the section on contributors published [...].

EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT is published quarterly, one volume
per calendar year, at N. Queen St. and McGovern Ave., Lancaster, Pa. and
917 Fifteenth St., N. W., Washington 5, D. C. Entered as second class matter
October 8, 1945 at the Post Office at Lancaster, Pa., under the act of
March 3, 1879. Copyright, 1946, by Frederic Kuder.

Subscription rate $4.00 a year [...]. Single copies, $1.25. Back
volumes: Volume V (1945), $5.00. [...] are available in a small-print
edition at $3.00 per volume [...].


INDEX FOR VOLUME VI 


Adams, Clifford R.
    THE PREDICTION OF ADJUSTMENT IN MARRIAGE .......... 185
Adkins, Dorothy C.
    CONSTRUCTION AND ANALYSIS OF WRITTEN TESTS FOR PREDICTING
    JOB PERFORMANCE
Adkins, Dorothy C. (with Milton M. Mandell)
    THE VALIDITY OF WRITTEN TESTS FOR THE SELECTION OF
    ADMINISTRATIVE PERSONNEL .......... 293
Bailey, H. W. (with Irwin A. Berg and William M. Gilbert)
    COUNSELING AND THE USE OF TESTS IN THE STUDENT PERSONNEL
    BUREAU AT THE UNIVERSITY OF ILLINOIS .......... 37
Banarer, Joseph (with D. Welty Lefever and Alice Van Boven)
    RELATION OF TEST SCORES TO AGE AND EDUCATION FOR
    ADULT WORKERS .......... 351
Banarer, Joseph (with D. Welty Lefever and Alice Van Boven)
    VALIDATION STUDIES ON JOB INFORMATION TESTS .......... 223
Bean, Kenneth L.
    THE DEVELOPMENT OF AN ENGLISH USAGE TEST FOR
    CLERKS, TYPISTS, AND STENOGRAPHERS
Berg, Irwin A. (with H. W. Bailey and William M. Gilbert)
    COUNSELING AND THE USE OF TESTS IN THE STUDENT PERSONNEL
    BUREAU AT THE UNIVERSITY OF ILLINOIS .......... 37
Berg, Irwin A. (with Graham Johnson and Robert P. Larsen)
    THE USE OF AN OBJECTIVE TEST IN PREDICTING RHETORIC
    SCORES .......... 429
Bixler, Ray H. (with Virginia H. Bixler)
    TEST INTERPRETATION IN VOCATIONAL COUNSELING .......... 145
Bixler, Ray H. (with Edward S. Bordin)
    TEST SELECTION: A PROCESS OF COUNSELING .......... 361
Bixler, Virginia H. (with Ray H. Bixler)
    TEST INTERPRETATION IN VOCATIONAL COUNSELING .......... 145
Bordin, Edward S. (with Ray H. Bixler)
    TEST SELECTION: A PROCESS OF COUNSELING .......... 361
Bradley, Mary Edith
    A STUDY OF THE VALIDITY OF THE ARMED FORCES INSTITUTE
    TESTS OF GENERAL EDUCATIONAL DEVELOPMENT IN THE
    FIELD OF SOCIAL STUDIES .......... 265
Brogden, Hubert E.
    THE EFFECT OF BIAS DUE TO DIFFICULTY FACTORS IN PRODUCT-MOMENT
    ITEM INTERCORRELATIONS ON THE ACCURACY OF ESTIMATION OF
    RELIABILITY BY THE KUDER-RICHARDSON FORMULA NUMBER 20 .......... 517


Chase, Wilton P.
    MEASUREMENT OF ATTITUDES TOWARD COUNSELING
Cronbach, Lee J.
    RESPONSE SETS AND TEST VALIDITY
Donahue, Wilma T.
    UNIVERSITY OF MICHIGAN NORMS FOR THE UNITED STATES
    ARMED FORCES INSTITUTE TESTS OF GENERAL EDUCATIONAL
    DEVELOPMENT
Dysinger, Wendell S.
    THE USE OF TESTS AT MACMURRAY COLLEGE .......... 61
Feder, Daniel D.
    THE USE OF OBJECTIVE ACHIEVEMENT EXAMINATIONS IN A
    NAVAL TRAINING PROGRAM
Fensch, Edwin A.
    A STUDY OF PSYCHOLOGICAL REPORTS IN A SCHOOL SYSTEM
[author line missing]
    THE EXPERIMENTAL EVALUATION OF A SELECTION PROCEDURE
[author line missing]
    DATA REGARDING THE RELIABILITY AND VALIDITY OF THE
    ACADEMIC INTEREST INVENTORY
Guilford, J. P.
    NEW STANDARDS FOR TEST EVALUATION
Harrell, Thomas W.
    ARMY GENERAL CLASSIFICATION TEST RESULTS FOR AIR
    FORCES SPECIALISTS
Hershey, John O.
    THE PRACTICAL ADAPTATION OF COUNSELING AND TESTING
    TO AN INDUSTRIAL SCHOOL
Hildreth, H. M.
    A SCALE FOR MEASURING PSYCHOLOGICAL CHANGES DURING [...]
Holzberg, Jules D.
    PROJECTIVE TECHNICS IN A NEUROPSYCHIATRIC HOSPITAL
Jenkins, William Leroy
    QUICK METHOD FOR MULTIPLE R AND PARTIAL R'S
Jenkins, William Leroy
    SHORT-CUT METHOD FOR σ AND r
King, Joseph E.
    THE MODIFICATION-REVISION METHOD IN PSYCHOMOTOR
    MEASUREMENT

[Page numbers detached from the entries above in the scan: 261, 213, 249, 445, 37, 375, 427, 341]


Lefever, D. Welty (with Joseph Banarer and Alice Van Boven)
    VALIDATION STUDIES ON JOB INFORMATION TESTS .......... 223
Lewinski, Robert J.
    THE SHIPLEY-HARTFORD SCALE AS AN INDEPENDENT MEASURE
    OF MENTAL ABILITY .......... 253
Mandell, Milton M. (with Dorothy C. Adkins)
    THE VALIDITY OF WRITTEN TESTS FOR THE SELECTION OF
    ADMINISTRATIVE PERSONNEL .......... 293
Mosier, Charles I.
    RATING OF TRAINING AND EXPERIENCE IN PUBLIC PERSONNEL
    SELECTION
Pallister, Helen
    PSYCHOLOGICAL TESTING IN RELATION TO EMPLOYEE
    COUNSELING
Roe, Anne
    THE PERSONALITY OF ARTISTS .......... 401
Rogers, Carl R.
    PSYCHOMETRIC TESTS AND CLIENT-CENTERED COUNSELING .......... 139
Seymour, H. C.
    THE COUNSELOR AND THE HIGH SCHOOL TESTING PROGRAM .......... 73
Spache, George
    USING TESTS IN A SMALL SCHOOL SYSTEM
Staff, Advisement and Guidance Service, Veterans Administration
    THE USE OF TESTS IN THE VETERANS ADMINISTRATION
    COUNSELING PROGRAM .......... 17
Stalnaker, John M. (with Ruth C. Stalnaker)
    THE EFFECT ON A CANDIDATE'S SCORE OF REPEATING THE
    SCHOLASTIC APTITUDE TEST OF THE COLLEGE ENTRANCE
    EXAMINATION BOARD .......... 298
Stalnaker, Ruth C. (with John M. Stalnaker)
    THE EFFECT ON A CANDIDATE'S SCORE OF REPEATING THE
    SCHOLASTIC APTITUDE TEST OF THE COLLEGE ENTRANCE
    EXAMINATION BOARD .......... 499
Swanson, Donald E.
    THE ROLE OF TESTING IN STUDENT PERSONNEL SERVICES AT
    HAMLINE UNIVERSITY .......... 25
Taylor, Erwin K.
    SOME SUGGESTIONS FOR THE IMPROVEMENT OF MACHINE-SCORING
    METHODS
Traxler, Arthur E.
    EVALUATION OF APTITUDE AND ACHIEVEMENT IN A GUIDANCE
    PROGRAM
Troyer, Maurice E.
    AN ATTEMPT TO IMPROVE THE COMPREHENSIVE EXAMINATION
    AT THE MASTER'S LEVEL
Van Boven, Alice (with Joseph Banarer and D. Welty Lefever)
    RELATION OF TEST SCORES TO AGE AND EDUCATION FOR
    ADULT WORKERS .......... 351
Van Boven, Alice (with Joseph Banarer and D. Welty Lefever)
    VALIDATION STUDIES ON JOB INFORMATION TESTS .......... 223
Wilson, Margaret H.
    THE SELF-APPRAISAL PROGRAM IN THE PHILADELPHIA
    JUNIOR HIGH SCHOOLS .......... 81
Wrenn, C. Gilbert
    CLIENT-CENTERED COUNSELING .......... 439
Zerfoss, Karl P.
    A NOTE ON THE DIAGNOSIS AND TREATMENT OF SCHOLASTIC
    DIFFICULTIES .......... 269

EDUCATIONAL and PSYCHOLOGICAL MEASUREMENT

VOLUME SIX, NUMBER ONE, SPRING

Evaluation of Aptitude and Achievement in a Guidance Program.
    ARTHUR E. TRAXLER
[...] the Student [...]
The Use of Tests at MacMurray College. WENDELL S. DYSINGER .......... 61
The Counselor and the High School Testing Program. H. C.
    SEYMOUR .......... 73
The Self-Appraisal Program in the Philadelphia Junior High
    Schools. MARGARET H. WILSON .......... 81
Psychometric Tests and Client-Centered Counseling. CARL R.
    ROGERS .......... 139
Test Interpretation in Vocational Counseling. RAY H.
    BIXLER and VIRGINIA H. BIXLER .......... 145
Measurement News .......... 157
[...] .......... 159
Measurement Abstracts .......... 163

Copyright, 1946, by
FREDERIC KUDER


EVALUATION OF APTITUDE AND ACHIEVEMENT 
IN A GUIDANCE PROGRAM 


ARTHUR E. TRAXLER 


Educational Records Bureau

Introduction


One of the most important changes currently taking place
in American schools is the transfer of the interests and efforts
of teachers from subject matter to students. It is the change
from the formal teaching of groups to the guidance of individual
boys and girls. This trend in educational philosophy and
practice had its beginnings about the time of the first World
War; it gathered impetus during the 1920's, and it expanded
notably during the 1930's and early 1940's. It is still confined
to the more enlightened schools and the better trained teachers,
but there are hopeful signs that it may eventually spread to
all elementary and secondary schools and even, in time, to all
colleges.

As schools in increasing numbers undertake guidance programs
it is becoming generally recognized that if teachers and
counselors are going to cooperate purposefully and effectively
in the guidance of individuals, they must be provided with
dependable information about each individual, and they must
be thoroughly informed concerning the meaning and uses of
this information. A considerable portion of [...] information
can be obtained by means [...] appraising aptitude and
achievement.


Meaning of Aptitude and Achievement 


Well-defined thinking concerning the evaluation of aptitude
and achievement requires, first of all, an [...]
of these two terms and the relationship between them. It is
sometimes thought that aptitude and achievement have
separate origins. Aptitudes are naively assumed to be inborn
characteristics and achievements are regarded as the product 
of training, whereas the two simply represent different empha- 
ses upon native ability and training. One's aptitudes are one's 
potentialities for success in given areas, but these depend on 
both inborn characteristics and experience. It is not possible to 
separate the influences of heredity and environment upon apti- 
tude, nor would this kind of separation be of much practical 
importance in the prediction of success, even if it could be 
made. Similarly, one's achievement is the level of skill, knowl- 
edge, and understanding one has attained in a given field, and, 
as is true of aptitudes, this level depends upon a complex of 
inborn traits and experiences which do not yield themselves to 
precise analysis. 

Both the difference and the similarity between aptitude and 
achievement may perhaps be clarified by noting the procedures 
we use in attempting to make evaluations in each field. When 
evaluating aptitude we try to place the emphasis upon native 
capacity by posing problems in which the individual has had 
no formal training. When evaluating achievement we attempt 
to emphasize training by formulating tasks dealing with ma- 
terials similar to those he has studied or with which he has had 
experience. For example, we often base the evaluation of nu- 
merical aptitude partly upon a test of number series which, as 
a rule, is not taught in the mathematics curriculum, whereas in 
the evaluation of achievement in mathematics, one of the com- 
mon tests is concerned with speed and accuracy in computation, 
which is taught in the mathematics course. 

It is to be noted further that when we are dealing with aptitude
for a certain field or with achievement in a given area, we
[...] not only with a combination of aptitude and
[...] mechanical aptitude, for instance,
[...] of discrete aptitudes (space perception,
[...] ingenuity, muscular dexterity, and so forth)
and a variety of achievements (familiarity with mechanical
[...] various tools, and other acquired [...]).
[...] instruments have recently been made
available for the measurement of fairly pure “primary factors,”
and that further developments of that kind are to be expected,
but in no field of human endeavor is success dependent upon
just one of these factors.


Evaluation of Aptitude


Helpful information concerning aptitudes may be obtained
by means of observation and other nonstandardized procedures.
Studies have shown, for example, that the school marks earned
by high school pupils are one of the best criteria for the
prediction of their success in college and that they are also related
to vocational success. As measurement techniques have improved,
however, there has been an increasing tendency to base
the evaluation of aptitude upon tests. For the appraisal of
certain kinds of aptitudes, scholastic aptitude in particular,
tests have almost entirely superseded uncontrolled observation.

For purposes of guidance, aptitudes are not independent
entities. The only kind of aptitude which counselors, and those
they advise, are interested in is aptitude for something. In the
appraisal of aptitude, therefore, the first question a counselor
needs to ask is “Aptitude for what?” In other words, “Concerning
what kinds of aptitude will I need to have information
in order to do an adequate job of guidance?” The answer
depends entirely upon the goals of the pupils being advised.
These will vary from school to school and from individual to
individual, but nearly all will have some goals in common.
Several types of aptitude tests are useful in all schools and with
nearly all individuals.

A scholastic aptitude test probably has broader and more
numerous uses than any other kind of test that a school
counselor can use. It has potential values for the prediction of
success in every school subject and in many vocations, although
its usefulness is much greater in certain fields of study than in
others. The usual kind of scholastic aptitude test has greatest
value for prognosis with respect to areas in which language or
verbalization is very important and is least helpful in
forecasting success in fields where space relationships and motor
skills are predominant. The limitation of scholastic aptitude
tests for prediction in the latter area can be offset by extending
these tests to include a greater variety of items.

The specificness with which scholastic aptitude is measured 
may vary all the way from a single over-all measurement to 
measurement of aptitude for each subject. Formerly many tests 
were prepared at both extremes of this scale. Thus, on the one 
hand, we had the development of many general intelligence 
tests yielding a single mental age and IQ, such as the Stanford- 
Binet Scale (33), the Otis Self-Administering Test of Mental 
Ability (20), and the Kuhlmann-Anderson Intelligence Test 
(12); and, on the other hand, there were made available various 
prognostic tests in the school subjects, such as the Symonds 
Foreign Language Prognosis Test (32), the Orleans Algebra 
Prognosis Test (19), and the Lee Test of Geometric Aptitude 
(13). The first type is still widely used and will no doubt con- 
tinue to have a place in guidance programs whenever a quick, 
general measurement of mental ability is needed for purposes 
of broad prediction of success in school, business, or the pro- 
fessions, even though such a test obscures differences in kinds 
of aptitudes within the individual. Experience with the second 
type has usually indicated that the predictive value of tests 
constructed for prognosis within a given subject matter area is
little, if any, higher than that of the better tests of general 
scholastic aptitude. 

The present tendency is toward the construction and use of 
scholastic aptitude tests that fall between these two extremes. 
They are somewhat diagnostic; yet they do not attempt to 
provide prognostic scores for each subject field. Since studies 
have shown that in the academic fields (English, mathematics,
science, social studies and languages) success depends in
considerable degree upon varying combinations of linguistic
aptitude and quantitative aptitude, the majority of the newer
scholastic aptitude tests yield separate scores in these two areas.
In some of these tests, such as the American Council on
Education Psychological Examination (34) and the California Tests
of Mental Maturity (31), provision is made for combining these
two scores into a gross score, if desired, while in others, for
example the College Entrance Examination Board Scholastic
Aptitude Test (7) and the Secondary Education Board Junior
Scholastic Aptitude Test (25), the scores are kept separate.
With the improvement and better standardization of this 
type of scholastic aptitude test, and as research makes clearer 
the relationship of the two types of scores to success in the 
different fields of study, it may be expected that counselors 
will find less use for tests of general mental ability and very 
slight need for prognostic tests in each subject. The decreased 
demand for subject prognosis tests is evidenced by the fact 
that no new tests of this type have been published in the 
academic fields for some years.
For purposes of predicting success in the academic subjects,
a test which provides verbal and numerical scores is a happy
compromise between the need for valid measurement of aptitude
and the desire to base the appraisal of aptitude upon a
test which can be given and scored within a reasonable time.
Better prediction in all the academic fields could be obtained
by using a greater variety of tests, but the law of diminishing
returns operates rather drastically when one goes beyond the
verbal and numerical factors, and the increased predictive
value may not be worth the considerable additional outlay in
time and expense. For the prediction of success in the fine
and practical arts and in commercial subjects, however, and
for purposes of vocational guidance, to which schools are
giving increased attention, other measures of aptitude are
needed.
These additional measures may be obtained in two ways.
In the first place one may employ longer, more varied, and more
diagnostic aptitude test batteries. Two noteworthy batteries
of this kind are the Chicago Tests of Primary Mental Abilities,
devised by the Thurstones (35), and the Yale Educational
Aptitude Tests, prepared by A. B. Crawford (8). The Chicago
Tests of Primary Mental Abilities are designed for ages 11 to
17. They yield a profile for six factors: number, verbal meaning,
space, word fluency, reasoning, and memory. Similar tests
are being developed for the kindergarten and first grade and for
the intermediate grades. The Yale Educational Aptitude Tests
are intended for senior high-school students and college
freshmen. They consist of seven tests including verbal comprehension,
artificial language or linguistic facility, verbal reasoning,
quantitative reasoning, mathematical aptitude, spatial
visualizing, and mechanical ingenuity. The guidance values of
both batteries will become greater as soon as we know more
about their relationship to different kinds of outcomes. Certain
logical relationships are, however, obvious. It seems clear, for
example, that the space factor in the Chicago battery and the
spatial visualizing and mechanical ingenuity tests in the Yale
battery are related to mechanical aptitude.

In the second place [...]
scholastic aptitude may [...]
aptitude in the arts, the [...]
and mechanical fields. [...]
Musical Talent (25), [...]
(16), the Minnesota [...]
or the Bennett Mech[...]
administered to indiv[...]
prehensive measurement.
It is true that the skills used in [...]
closely similar and that tests for [...]
ministered [...]
No such test is a [...]



with a variety of occupational keys, multiple-scoring tests of 
vocational interests should form an integral part of the evalu- 
ation techniques in every guidance program. The Strong Vo- 
cational Interest Blanks (30, 31), which can be scored with 
thirty-five occupational scales and a number of scales for
occupational groups, the Kuder Preference Record (12), which
can be scored with scales for nine broad fields, and certain other 
interest tests yield scores that compare favorably in reliability 
and consistency, over a period of years, with scores on the 
better aptitude and achievement tests. 

For purposes of evaluating the aptitudes of certain young
people at the end of the secondary school and in college,
counselors should be aware of the help that may be obtained from
tests constructed under the sponsorship of different professional
groups. Few of these tests are available for administration by
high school and college counselors, but young people of high
general ability who have interests in specific professions may
be advised to ask the proper professional organization for
permission to take such tests. The Moss Scholastic Aptitude Test
for Medical Students (17) has been used for years under the
auspices of the Committee on Aptitude Tests for Medical
Students. Information on the Yale Legal Aptitude Tests has been
published by Crawford and Gorham (9). The National
Teacher Examinations are administered by the Cooperative
Test Service of the American Council on Education (24). A
Pre-Engineering Inventory prepared by K. W. Vaughn is
administered by the Measurement and Guidance Project in
Engineering Education (40). The American Institute of
Accountants is carrying on an extensive project in the construction
and evaluation of tests to select accountancy personnel (18).
Skillful guidance of young people of high ability and character
into the professions is of paramount importance not only for
the benefit of the individual but for the welfare of society as a
whole.

Evaluation of Achievement 


Although the appraisal of achievement has always been one
aspect of the process of educating and advising young people,
it is well known that this type of evaluation was almost entirely
subjective until about thirty years ago. The first objective
measurements of achievement were applied to facts and skills. 
Various other kinds of achievement were gradually attacked 
objectively. At present we still depend to some extent upon 
subjective methods, particularly in the appraisal of processes 
such as ability to do creative writing, but objective procedures 
are being applied successfully to several areas for which they 
were formerly thought to be unsuited. For instance, by means 
of a series of questions all centered upon a problem stated in 
paragraph form, it is possible to evaluate a pupil’s ability to 
draw logical inferences from a set of data or to generalize from 
specific facts. 

The breadth of measurement provided by modern achieve- 
ment tests and their potential worth as counseling instruments 
may perhaps best be shown by indicating the steps taken in the 
construction of one of these instruments. 

The usual achievement test consists of perhaps 100 to 200 
brief answer questions. At first glance it looks like the sort of 
thing that almost any teacher could make up. On the surface 
there is little evidence of the careful work that goes into the 
making of a good achievement test. 

The building of such a test involves at least twelve steps as
follows:

 1. A survey of the aims or objectives in the subject for
    which the test is to be made through the use of textbooks,
    courses of study, and questionnaires to schools.

 2. Selection of those purposes which are widely accepted
    and which can be measured objectively.

 3. A decision concerning the weight to be assigned to the
    different objectives.

 4. Preparation of test items bearing upon the various
    objectives.

 5. The setting up of a trial form of the test including at
    least 50 per cent more items than will be in the final
    form.

 6. Submission of the trial form to specialists for criticism.

 7. Administration of the experimental form to several
    groups of pupils.

 8. A statistical analysis of the items in terms of difficulty
    and of validity as measured by a suitable criterion.

 9. Selection of the best items for the final form of the test
    on the basis of the comments of the critics and the item
    analysis.

10. The scaling of the test on the basis of the performance
    of a defined criterion group so that it may be compared
    with other forms of the test and with tests in other
    fields, as, for example, the setting up of Scaled Scores
    for the Cooperative tests.

11. The finding of norms for various grades or years of
    study.

12. The formulation of precise directions for administering
    and scoring so that it will be possible for all persons
    giving the test and scoring it to obtain identical results.
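
Steps 8 and 9 of the list above can be made concrete with a short sketch. The Python below is only an illustration under stated assumptions, not the procedure of any particular test agency. It assumes that each item is scored 1 (right) or 0 (wrong), that the pupil's total score serves as the criterion (the text asks only for "a suitable criterion"), and that the top and bottom 27 per cent of pupils are compared, a common convention adopted here for convenience. The function names, the cut-off of 0.3, and the sample `responses` are hypothetical.

```python
from statistics import mean

# Hypothetical trial-form results: one row per pupil, one column per item,
# 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]

def item_difficulty(responses, item):
    """Proportion of pupils answering the item correctly (higher means easier)."""
    return mean(row[item] for row in responses)

def item_discrimination(responses, item, fraction=0.27):
    """Upper-lower index: proportion correct in the top-scoring group minus
    the proportion correct in the bottom-scoring group.
    Assumption: total score on the trial form is the criterion."""
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    return mean(row[item] for row in upper) - mean(row[item] for row in lower)

def select_items(responses, min_discrimination=0.3):
    """Keep the items whose discrimination index reaches an assumed cut-off."""
    n_items = len(responses[0])
    return [i for i in range(n_items)
            if item_discrimination(responses, i) >= min_discrimination]

for i in range(len(responses[0])):
    print(i, round(item_difficulty(responses, i), 2),
          item_discrimination(responses, i))
print("retained items:", select_items(responses))
```

In this made-up sample the last item is answered correctly more often by low scorers than by high scorers, so its index is negative and it would be dropped. In practice the comments of the critics (step 6) would be weighed alongside such figures.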

Thus the construction of a valid achievement test is a 
painstaking and detailed process calling for the cooperation of
many persons. When all of these steps are carefully followed 
by test makers, counselors may regard the resulting tests with 
considerable confidence. 

Numerous achievement tests are now available for nearly
every school subject. Although many of these were apparently
carelessly constructed and are so lacking in the characteristics of
a good test that they cannot be recommended, there is a variety
of meritorious achievement tests at all levels from grade 1 to
college. For the elementary school at least four comprehensive
achievement batteries are worthy of consideration. These are
the Stanford (29), Metropolitan (19), Progressive (37), and
Iowa Basic Skills tests (28). The Stanford and Metropolitan
tests sample the wider range of subjects, whereas the Progressive
and Iowa Basic Skills are somewhat the more diagnostic
in those areas which they cover.

At the secondary-school and junior-college levels, the
Cooperative Achievement tests (6) have been used in schools and
colleges throughout the United States for a number of years.
The Cooperative Test Service, a subsidiary of the American
Council on Education, was set up early in the 1930's through a
grant from the General Education Board. Under the direction
of Ben D. Wood numerous forms of tests in practically all the
academic subjects were constructed during a ten-year period.
The many different tests in this series were coordinated and
rendered comparable by means of a system of scaled scores
devised by John C. Flanagan. For a complete testing program in
a public school these tests may need to be supplemented by
achievement tests in commercial subjects, the practical arts,
and the fine arts.
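
The idea of rendering different tests comparable through scaled scores can be illustrated with a generic sketch. It is not a reconstruction of Flanagan's system, whose details are not given here. It assumes a simple linear conversion in which the mean of a defined criterion group on each form is placed at 50 and one standard deviation equals 10 scale points; the function name `make_scaler` and the sample raw scores are hypothetical.

```python
from statistics import mean, pstdev

def make_scaler(criterion_raw_scores, target_mean=50.0, target_sd=10.0):
    """Return a function converting raw scores on one test form to a common
    scale, using the performance of a criterion group on that form.
    Assumption: a linear standard-score conversion is adequate."""
    m = mean(criterion_raw_scores)
    s = pstdev(criterion_raw_scores)
    if s == 0:
        raise ValueError("criterion group scores show no spread")

    def scale(raw):
        return target_mean + target_sd * (raw - m) / s

    return scale

# Hypothetical raw scores of the same criterion group on two forms of a test.
scale_form_a = make_scaler([10, 20, 30])
scale_form_b = make_scaler([40, 50, 60])

# A raw score of 30 on Form A and 60 on Form B stand equally far above
# the criterion group's mean, so they receive the same scaled score.
print(scale_form_a(30), scale_form_b(60))
```

Once both forms are expressed on the same scale, a pupil's results on different forms or in different subjects can be set side by side, which is the practical purpose the text attributes to scaled scores.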

The Iowa Tests of General Educational Development, con- 
structed under the direction of E. F. Lindquist (15), are an- 
other superior set of achievement tests. These tests, which were 
developed in the Iowa State Testing Program, are now available 
nationally.

The United States Armed Forces Institute Tests of General 
Educational Development (39), prepared under the super- 
vision of Ralph W. Tyler and E. F. Lindquist, are the most 
recent achievement tests designed to cover the academic fields 
at the high school and college levels. These tests consist of
two general types: tests of general educational development 
and tests of achievement in particular subjects. Each test 
exists in two forms, one of which is restricted to use in the 
Armed Forces. The other form is available for civilian use and 
is distributed by the Cooperative Test Service and Science 
Research Associates, 

For use at the end of the college course, there is a comprehensive
achievement battery, the Graduate Record Examination
(11), which is subsidized by the Carnegie Foundation for
the Advancement of Teaching.

Many persons working in the field of guidance feel that the 
next big step forward in achievement testing should be the 
development of a coordinated battery of comparable tests for 
successive levels from the intermediate grades to the end of 
college. Such a battery would greatly enhance the ease and 
confidence with which long-time growth records of the achieve- 


ment of individual pupils could be kept and used as a basis of 
intelligent counseling.


Organizing and Using the Results of Aptitude and 
Achievement Tests 


The value of testing in a guidance program is almost wholly 
dependent upon the effectiveness with which the results are 
used by the faculty of the school. Intelligent and efficient use 
of the data calls for a definite plan of testing and of organizing 
and reporting the results and rendering them understandable 
to teachers, many of whom have had little training in psychology,
measurement, or statistical procedures. It is highly
desirable for one member of the faculty to be given definite
responsibility for this aspect of the guidance program and to be
released from part of his teaching duties so that he may have
sufficient time for the detailed planning that is necessary.

A school should adopt a systematic testing program
consisting of a battery of scholastic aptitude tests, achievement
tests, and other tests to be administered annually or
semi-annually to all pupils, and of special tests to be given to
individuals in connection with problems involving diagnosis,
remedial work, and counseling. Procedures should be set up
for the routinizing of the mechanics of the program in order
to insure efficient and accurate scoring and the reporting of the
results to teachers and counselors in a form they can use. In
this connection consideration should be given to the feasibility
of participation in state-wide testing programs, local cooperative
programs centered around a scoring installation at an
educational institution which can serve several neighboring high
schools, or nationwide programs such as the one sponsored by
the Educational Records Bureau for independent schools.

The test data should be entered on cumulative record cards
in such a way that interrelationships between the scores and
their relation to other kinds of information concerning the pupil
can be noted easily. The organization of the cumulative record
form should be such that the form is inherently a growth record.
Helpful suggestions can be obtained from the revised cumulative
record forms of the American Council on Education, which
are published at four levels: primary grades, intermediate
grades, junior and senior high school years, and college.




A continuous program of education in the use of tests and
other techniques of evaluation is an essential feature of a guidance
program. Among the basic materials for a teacher-training
program are books on tests and other evaluating
devices and their uses, such as Bingham's Aptitudes and Aptitude
Testing (3), Remmers and Gage's Measurement and
Evaluation (22), Buros' Mental Measurements Yearbooks
(5), and the publications of the Cooperative Test Service, the
Educational Records Bureau, the Iowa State Testing Programs,
and other test service agencies. Books on counseling procedures
such as Williamson's How to Counsel Students (41), Darley's
Testing and Counseling in the High School Guidance Program
(10), and Rogers' Counseling and Psychotherapy (23) are also
indispensable.

So far as possible, the program of informing teachers concerning
guidance techniques should be centered around the
actual measurement and evaluation instruments which have
been adopted for use in that particular school system. An
excellent illustration of this approach to teacher education in
guidance is furnished by a recent publication of the School
District of Philadelphia, The Self-Appraisal Program of Guidance
in the Junior High Schools of Philadelphia: Handbook for
Teachers (27).

Staff clinics or case conferences provide a further means of
vitalizing and improving the training of teachers in techniques
of evaluating aptitudes, achievements, and other qualities of
individual pupils (38). Guidance workshops, both those in
connection with teacher-training institutions and those [...]
by local school systems, serve a similar purpose. The guidance
movement in the schools of the United States will be successful
in direct proportion to the degree in which these procedures
[...]. For schools can do a thorough [...]
teachers themselves have [...] program and have accepted
responsibility for "carrying the ball."

REFERENCES 


1. Andrew, D. M. Minnesota Vocational Test for Clerical Workers.
New York: Psychological Corporation, 1933-1938.

2. Bennett, G. K. Mechanical Comprehension Test. New York:
Psychological Corporation.

3. Bingham, W. V. Aptitudes and Aptitude Testing. New York:
Harper and Brothers, 1937.

4. Buros, O. K. The Nineteen Thirty-eight Mental Measurements
Yearbook. New Brunswick: Rutgers University Press, 1938.

5. Buros, O. K. The Nineteen-Forty Mental Measurements Yearbook.
Highland Park, N. J.: Mental Measurements Yearbook, 1941.

6. Cooperative General Achievement Tests (Revised Series). New
York: Cooperative Test Service, 1940.

7. College Entrance Examination Board Scholastic Aptitude Test.
Princeton: College Entrance Examination Board.

8. Crawford, A. B. Yale Educational Aptitude Tests. New Haven:
Department of Personnel Studies, Yale University.

9. Crawford, A. B. and Gorham, T. J. "The Yale Legal Aptitude
Test," Yale Law Journal, XLIX (1940), 1237-1240.

10. Darley, J. G. Testing and Counseling in the High School Guidance
Program. Chicago: Science Research Associates, 1943.

11. Graduate Record Examination. New York: Carnegie Foundation
for the Advancement of Teaching.

12. Kuder, G. F. Preference Record. Chicago: Science Research
Associates, 1942.

13. Kuhlman, F. and Anderson, R. G. Kuhlman-Anderson Intelligence
Tests. Minneapolis: Educational Test Bureau, 1927-1939.

14. Lee, D. M. and Lee, J. M. Lee Test of Geometric Aptitude.
Los Angeles: California Test Bureau, 1931.

15. Lindquist, E. F. Iowa Tests of General Educational Development.
Chicago: Science Research Associates.

16. Meier, N. C. and Seashore, C. E. Meier-Seashore Art Judgment
Test. Iowa City: Bureau of Educational Research and
Service, State University of Iowa, 1929-1930.

17. Moss, F. A. "Scholastic Aptitude Tests for Medical Students,"
Journal of the American Association of Medical Colleges,
VI (1931), 1-16.

18. Nissley, W. W. "Selection of Accounting Personnel," Papers
Presented at the Fifty-Seventh Annual Meeting of the
American Institute of Accountants. New York: American
Institute of Accountants, 1944.

19. Orleans, J. S., Editor. Metropolitan Achievement Tests.
Yonkers-on-the-Hudson: World Book Company, 1931-1937.

20. Orleans, J. B. and Orleans, J. S. Orleans Algebra Prognosis Test.
Yonkers-on-the-Hudson: World Book Company, 1928-1932.

21. Otis, A. S. Otis Self-Administering Test of Mental Ability.
Yonkers-on-the-Hudson: World Book Company, 1936-1939.

22. Remmers, H. H. and Gage, N. L. Educational Measurement and
Evaluation. New York: Harper and Brothers, 1943.

23. Rogers, C. R. Counseling and Psychotherapy. Boston:
Houghton-Mifflin, 1942.

24. Ryans, D. G. "The Professional Examination of Teaching Candidates:
A Report of the First Annual Administration of the
National Teacher Examination." School and Society, LI
(1940), 273-284.

25. Seashore, C. E., Lewis, D. and Saetveit, J. G. Seashore Measures
of Musical Talent. Camden: R.C.A. Manufacturing Company,
Inc., 1919-1939.

26. Secondary Education Board Junior Scholastic Aptitude Test.
Milton, Mass.: Secondary Education Board.

27. Self-Appraisal Program of Guidance in the Junior High Schools
of Philadelphia: Handbook for Teachers. Philadelphia:
School District of Philadelphia, Board of Education, 1944.

28. Spitzer, H. F. et al. Iowa Every-Pupil Tests of Basic Skills.
Boston: Houghton-Mifflin, 1940.

29. Stanford Achievement Tests. Yonkers-on-the-Hudson: World
Book Company.

30. Strong, E. K. Vocational Interest Blank for Men. Stanford
University: Stanford University Press, 1927-1938.

31. Strong, E. K. Vocational Interest Blank for Women. Stanford
University: Stanford University Press, 1933-1938.

32. Sullivan, E. T., Clark, W. W. and Tiegs, E. W. California Test
of Mental Maturity. Los Angeles: California Test Bureau,
1936-1939.

33. Symonds, P. M. Foreign Language Prognosis Test. New York:
Bureau of Publications, Teachers College, Columbia University,
1930.

34. Terman, L. M. and Merrill, M. A. Stanford-Binet Tests of
Intelligence (Revised). Boston: Houghton-Mifflin, 1937.

35. Thurstone, L. L. and Thurstone, T. G. American Council on
Education Psychological Examination for High School Students,
Form 1944. Washington: American Council on Education,
1944.

36. Thurstone, L. L. Primary Mental Abilities. Chicago: University
of Chicago Press, 1938.

37. Tiegs, E. W. and Clark, W. W. Progressive Achievement Tests.
Los Angeles: California Test Bureau, 1933-1938.

38. Traxler, A. E. Techniques of Guidance. New York: Harper
and Brothers, 1945. Chapters 10 and 14.

39. United States Armed Forces Institute Tests of General Educational
Development. New York: Cooperative Test Service
of the American Council on Education, 1945.

40. Vaughn, K. W. "The Measurement and Guidance Project in
Engineering Education." Journal of Engineering Education,
XXXIV (1944), 516-520.

41. Williamson, E. G. How to Counsel Students. New York:
McGraw-Hill Book Company, 1939.


THE USE OF TESTS IN THE VETERANS ADMINISTRATION
COUNSELING PROGRAM¹
Staff, Advisement and Guidance Service, Veterans Administration, 
Washington, D. C. 

The need for psychological tests in the Veterans Administration's
counseling program is recognized in the vocational
rehabilitation provision of Public Law 16, 78th Congress, and
in the educational provisions of Public Law 346, 78th Congress.
Since these provisions are concerned with the veteran's vocational
and educational adjustment, the Veterans Administration
adheres to the policy that, to accomplish such adjustment,
accurate information must be provided not only on vocational
and educational standards and opportunities, but also on the
abilities, aptitudes, interests, and other personality traits of
the veteran. To obtain the latter type of information a comprehensive
testing program is considered to be indispensable.

As employed in the Veterans Administration, tests constitute
one of the important sources of information in the comprehensive
description of the individual for guidance purposes.
They owe their place in the counseling procedure to their quantitative
and relatively objective character and, when used in
conjunction with such other data as school and employment
records, summaries of interviews, ratings, military records,
and case histories, round out the picture of the individual.

In effective counseling, tests do not perform their function alone
but are always presented in the framework of the life pattern
of the individual. The importance of the framework or context
cannot be over-emphasized. It is only when a particular
test score is related to other scores on a comparable basis, or
to such other factors as age, sex, education, work experience,
physique, vocational plans, and personal ratings, that it comes
to have its greatest significance in counseling.

¹ This article was prepared by Central Office staff members of the Advisement
and Guidance Service of the Veterans Administration. It includes [...]
submitted by Dr. [...] of the Regional Office [...] Baltimore, Maryland,
and other parts of the manuscript are based on material submitted by
[...] L. R. Harmon of the Regional Office in Minneapolis, Minnesota.

Ideally, such comparison would be made on a quantitative 
basis, but objective methods for handling patterned case infor- 
mation are not as yet available. In the absence of quantitative 
procedures, qualitative judgments are made by counselors
based upon their clinical experience. Test scores are fitted
into the broad pattern of traits, experiences, values, and drives 
which characterize the counselee. In some instances the coun- 
selor, prior to testing, will have formed tentative judgments of 
the individual’s personality from records and interview infor- 
mation. Subsequent analysis of test scores may refute, alter, 
or confirm these estimates. In other cases, observation of the 
test profile prior to the personal interview will raise questions 
which will have to be answered during the interview or from a
review of the available records of the counselee. This inter- 
dependence, this knitting together of personnel data, is the 
essence of sound vocational diagnosis. 

The way in which test information is correlated with other 
personnel information obtained from records and from the 
interview is illustrated by the following abbreviated case history 
of a veteran: 

Case of John Doe—Age 30, Married, Four children, Dis- 
ability: Neurosis 30%. 

John Doe came for advisement after he was discharged 
from a veterans hospital. He appeared to be quite discouraged
and depressed about his inability to adjust to civilian life.
After his discharge from the Navy, he had returned to his pre-service
employment as a turret lathe operator in one of the
governmental agencies. He had retained his skill as a lathe
operator but his work aggravated his disability and he was subject
to frequent periods of unconsciousness. On the two occasions
when he entered the hospital he had been expressly
cautioned about returning to this kind of employment.

The veteran was highly motivated to return to productive
employment because of his family situation. His experience
and expressed interests definitely pointed to the mechanical
field and his successful employment as a lathe operator indi- 
cated training along mechanical or related lines. However, his 
disability was such that work in this field was contraindicated 
because of the possibilities of noisy and crowded conditions 
concerning which the veteran protested at length. Following 
the initial conferences, an interest test, a mental ability test, 
and a general achievement test were recommended by the 
counselor. 

The test results indicated that the veteran had an intel- 
ligence quotient of 105, that he was well above the average for 
his level in arithmetic computation, and that his interests were
similar to those of persons engaged in agricultural occupations.

Further interviewing elicited the fact that the veteran had 
an intense desire to live under the comparative quiet, outdoor 
conditions of agricultural life. He had refrained from men- 
tioning agriculture previously because he had established a 
home in the city and his wife preferred not to live in the country. 

A program was arranged whereby it would be possible for 
him to take a short, intensive course in dairy testing at a uni- 
versity, to be followed by a further period of training on the job. 
He entered training and successfully completed the course. He 
is now employed by a number of dairies in a job which permits 
him to travel in agricultural areas under favorable conditions. 
His income provides for his needs and he has experienced no 
lapse into unconsciousness since the training for his new job 
was initiated.

From the foregoing discussion it may be inferred that the
same principles apply to the use of tests in the Veterans Administration's
program as in other counseling situations. And,
to a large extent, this is true. However, there are certain
problems peculiar to the Veterans Administration's program.
For example, a very high proportion of the veterans of World
War II are eligible for training under either Public Law 16,
78th Congress, or Public Law 346, 78th Congress, or both. It
is apparent at this time that a very large number of veterans
will claim benefits under these laws. As a consequence, the
Veterans Administration will be obliged to render counseling
services to what amounts to a virtual cross-section of the male
population of the United States between the ages of eighteen 
and forty, to say nothing of the female veterans eligible under 
these laws. This great range of abilities, interests, aptitudes, 
values, social adjustments, occupational backgrounds, and edu- 
cational achievements will often confront even the most skilled 
counselor with problems of test interpretation beyond the 
boundaries of his experience. 

Many veterans in their late twenties and early thirties will 
be returning to school after three to five years in the service. 
The question arises as to whether these men should be com- 
pared with typical populations in the schools today, the mem- 
bers of which are many years younger and have attended school 
continuously. Also, the norms on most tests have been derived
from comparatively small samples, the representativeness of
which is often difficult to evaluate.

The veteran often comes to the counselor in a frame of mind 
which is not conducive to the establishment of satisfactory 
counselor-counselee rapport. In the main, veterans are adults 
accustomed to making their own decisions and they frequently 
arrive at the counselor's office suspecting that they are to be 
told rather than counseled. They have been "talked at"
rather than "talked to" by their friends, the press, and the radio,
and the amount of miscellaneous, conflicting advice and information
to which they have been subjected is sometimes very
great. Consequently, many veterans are in no mood to listen
to advice and suggestions based on what they regard to be
personal impressions derived wholly from case data. Objective
tests are particularly effective in meeting this situation in that
they appear to be divorced from the subjective opinion of the
counselor, and hence become potent factors in enabling the
veteran to make wise decisions regarding his vocational and
educational objectives. This is particularly true when objectives
are inconsistent with ability and demonstrated achievement.
Test norms, by reason of their objectivity, often speak
more convincingly to the veteran concerning what he can do
and what he cannot do than does the counselor himself. Test
data tend to have a very desirable effect in that they help the
individual accept his limitations in one field and seek the fulfillment
of his ambitions in another. The achievement of such
results naturally depends upon how effectively the counselor 
presents the test data to the veteran. 

Again, some veterans who apply for vocational training and 
education may be subject to personal or social maladjustments 
which have been precipitated by the transition from military 
to civilian life and therefore do not appear on their records. 
These are sometimes spotted by the counselor in interviews or 
may be discovered during the administration of personal adjustment
tests. Such inventories provide an excellent means
of discreetly calling the attention of the veteran to the intimate
connection between his social and emotional maladjustments
and his educational and occupational failures. 

Tests to be used in so comprehensive a counseling program 
as the one which the Veterans Administration has undertaken 
have to be selected insofar as possible on the basis of their 
general applicability, a procedure restricting the number of
tests which can be used. Furthermore, a veteran frequently
receives his initial counseling at a guidance center considerably
removed from the place where he is to take his vocational
training or education, and the transfer of records in such cases
is facilitated if the number of tests administered at the place
of initial counseling is not greater than is necessary to meet the
needs of the individual case.

The following list indicates some of the tests being used
most frequently in the counseling program. It is not an exhaustive
list. The guidance centers maintain supplies of additional
tests which are used when appropriate.


General Ability Tests:
    A.C.E. Psychological Examination for College Freshmen
    Ohio State University Psychological Test
    Otis Quick-Scoring Mental Ability Tests. Gamma Test: Form AM
    Revised Army Alpha Examination. Form 8 (Bregman)
    Wechsler-Bellevue Intelligence Scale
    Stanford-Binet (Terman-Merrill Revision)

Achievement Tests:
    U. S. Armed Forces Institute Tests of General Educational
    Development
    Cooperative Achievement Tests
    Stanford Achievement Tests
    Iowa High School Content Examination

Mechanical Aptitude Tests:
    Tests of Mechanical Comprehension (Bennett)
    Revised Minnesota Paper Form Board Test
    Minnesota Spatial Relations Test

Dexterity Tests:
    O'Connor Finger Dexterity Test
    O'Connor Tweezer Dexterity Test
    Minnesota Rate of Manipulation Test
    Purdue Pegboard Test

Clerical Aptitude Tests:
    Minnesota Vocational Test for Clerical Workers

Interest Inventories:
    Kuder Preference Record
    Vocational Interest Blank for Men (Strong)
    Vocational Interest Blank for Women (Strong)

Personality Inventories:
    Adjustment Inventory, Adult Form (Bell)
    Minnesota Multiphasic Personality Inventory

Trade Tests:
    Oral Trade Questions (U.S.E.S.)


It is felt that these tests are representative of the best
available. In order that counselors may have a wide latitude
of choice in prescribing test batteries and verifying test results
for an individual claimant, and because frequently more than
one test in a particular field is necessary to measure various
traits within that field, several measures have been included
in some of the fields. Various other tests are also used as circumstances
require to measure interests, aptitudes, and abilities
for professions and trades. Moreover, it is to be expected that
use will also be made of additional new measuring devices as
they become available and are found suitable.




Summary 


Psychological tests are employed in the Veterans Administration's
counseling program to provide quantitative and objective
information on the personal traits and characteristics
of the veteran. Test information is not used alone, but in conjunction
with personnel records, interviews, ratings, and case
histories. A test score is of greatest value in counseling when
it is related to other measures on a comparable basis, and to
the experiences, desires, and achievements of the counselee.
When available, quantitative methods are used for these comparisons,
but in the absence of such techniques, the counselor
must rely on his clinical judgment. Special counseling problems
arise in the Veterans Administration's program because
of the wide range of abilities and interests of veterans and the
difficulty of obtaining appropriate test norms. A selected list
of well-known psychological tests is available to counselors and
new tests will be provided as they become available.


THE ROLE OF TESTING IN STUDENT PERSONNEL 
SERVICES AT HAMLINE UNIVERSITY 


DONALD E. SWANSON 
Hamline University 

The diverse functions of testing in student personnel services
at the college level have been treated adequately in the
literature. Descriptions of how tests have been put to work in
implementing personnel services are less well publicized. An
analysis of how testing functions are integrated and coordinated
and a description of the uses of tests in a single program would
appear to be desirable. Such descriptions would allow comparisons
among the various colleges which are in the process of
improving or developing functioning personnel programs.

The purpose of this article is to show how tests have been
used and are being used in the development of student personnel
[services at Hamline University].¹

After many years of psychological testing in the colleges
of this country it is axiomatic that certain data about students
can be obtained most advantageously by an adequate and
systematic testing program. It was hoped that testing data
along with other diagnostic devices would promote a more
complete understanding of the students in our institution. Our
experience with the program in the past decade has shown that
students counseled on the basis of objective test interpretation
have gained clearer insights into their abilities, achievement,
interests, aptitudes, personality traits and attitudes as related
to sound educational and vocational planning. Furthermore,
counselors and teachers have come to depend upon objective
test evidence for obtaining realistic knowledge about the student
before the student is counseled or taught. Many faculty
members have used test data as a basis for guiding the learning
processes and growth of students in formal as well as in informal
campus activities. On the whole, testing has served as a
catalytic agent in implementing student personnel services and
in promoting a closer relationship between students and faculty.

¹ Hamline University is a co-educational College of Liberal Arts with a
School of [...] and the [...] School of Nursing. The enrollment is 900
students.

But, in addition to the above dominant role of testing, as
the testing program has developed, its tentacles have extended
into the many interrelated aspects of the educational fabric.
Test results have affected the teaching function, the admissions 
policy, and the quality of the student body as a whole as well 
as the behavior and adjustment of individual students who 
have been counseled. 

In short, a systematic testing program can yield the 
background of knowledge about the student which, in turn, 
may provide the foundation for more efficient counseling, for 
improved teaching and educational practices, and for institu- 
tional self-appraisal. 

We shall now show concretely how tests are put to work at
Hamline University in each of the following areas: (1) pre-admissions
and admissions counseling, (2) general counseling
program, (3) instructional practices, and (4) research and
evaluation.



Tests in Pre-admissions and Admissions Counseling 


Each new student is required to take a rather extensive
battery of tests prior to or after being admitted to the college.
Students are informed that participation in the testing program
will enable them to know themselves better and will
help their counselors to know more about them so as to aid
them in planning wisely a program of study and an appropriate
career. These testing services are offered to all students without
charge and data are secured concerning each individual's
vocational interests, personality adjustment, reading ability,
scholastic aptitude and achievement in various academic areas.
These data along with other information become the basis for
the counseling service which is provided for the student.

The “drag-net” battery of entrance tests which serves as 
a basis for admissions counseling and for setting up the general 
counseling program includes: the Strong Vocational Interest 
Blank, the Minnesota Personality Scale, the Iowa Silent Read- 
ing Test, the Cooperative Achievement Tests of General Pro- 
ficiency in the Fields of Social Studies and Natural Sciences, 
the American Council on Education Psychological Examination 
and the Cooperative English Test. The latter two tests are 
given to seniors in high school in the Minnesota state-wide
testing program and the results are available for pre-admission
counseling. The Moss Nursing Aptitude Test is added to the
above battery for prospective students in nursing.

The tests mentioned above are administered on specified 
dates during the summer testing program, at the beginning of 
either semester, or they are given at the convenience of the
individual. For the past five summers prospective freshmen
have been encouraged to take this sequence of tests in advance
of registration at one of the two or three testing periods announced
for this service. Students advise the Office of Admissions
by letter of the date on which they desire to take the
tests. Overnight accommodations are provided on the campus
for out-of-town applicants.

Students who have shown an interest in our college have
been very quick to respond favorably to this venture. In
spite of transportation difficulties 163 and 156 students, respectively,
were tested on two separate days during the past two
summers. It has been our experience that most of the able
students in this group later register at Hamline University. It
is probable that they would have sought to enroll anyway but
it is obvious that some were aided in deciding whether or not
they could profit by attending this college. The chief value
of the summer testing plan, however, lies in the fact that complete
entrance testing data on these students are available
earlier than data on those who take tests during freshman week.
Consequently those who take tests during the summer have an
opportunity for more extensive vocational and educational
counseling prior to registration in the fall. They are invited to 
interview their counselors in advance of registration and are 
informed of the counselor's office hours. A clinical data folder
for each counselee includes profiles of all test results including 
vocational interest test and personality scale data which are 
used as aids in educational and vocational planning. Students 
who take entrance tests in the fall just prior to registration do 
not have the advantage of having their vocational interest 
test and personality scale data included in the counselor’s profile 
at the time of registration because of the scoring hurdle. In 
most cases this lack represents a distinct handicap in the efficacy
of counseling. Another advantage of summer testing is
that it relieves the scoring congestion in the fall which is so
frequently associated with entrance testing programs.²

A small fee to cover the cost of scoring the tests is charged 
for the summer testing program but the fee is refunded to all
who enroll later. The administration has taken the attitude
that a progressive college owes the best kind of student personnel
service to its clientele. Testing represents an objective
approach to student analysis and understanding which students 
are beginning to expect and can well demand. 

The policy of giving a rather extensive battery of tests to
every new student at as early a time as possible and making the
most of the information to help in registration planning seems
to be more adequate than the older policies of sending them
into the personnel office for tests and help when problems have
arisen or merely announcing that testing services are available.
The former plan is a frontal attack on the problem and favors
[...] guidance should be primarily preventive [...]. An ideal
plan for [...] that all students be given a [...] counseling
[...] registration. But such a plan will be difficult to realize
until the college clientele is made more aware of scientific personnel
procedures and until the present shortage of trained personnel
staff is alleviated. It will remain for a few colleges to pioneer
in this scientific project. One can hope that the time is not
far off when students will demand that college counselors know
something about them and relate that information to their
goals and purposes instead of prescribing courses with "shot
gun" methodology.

² We are fortunate that the University of Minnesota Counseling Bureau has
offered its machine scoring [...] to us at scheduled times for some of the
tests which are given on a [...]. A cooperative arrangement was worked out
[...] years ago between the Director of the Counseling Bureau and four
St. Paul colleges.

Those students who do not meet the preliminary standards 
of admission to Hamline University are given further tests so 
as to help objectify our own supplementary criteria for admission.
There are individual students who for one reason or another
failed to render a true account of themselves in their
high school records or in the scholastic aptitude tests taken in
the high school. Such individuals are encouraged to take additional
tests of mental ability and proficiency to further evaluate
their potentialities of competing at the college level. For this
purpose the Ohio Psychological Examination or another form
of the American Council on Education Psychological Examination
is administered individually at the time of need
in the offices of admissions or of student personnel. In addition
the General Educational Development Tests are given
occasionally, especially to returned veterans. We have found
it advisable in a few instances to include the Minnesota Multiphasic
Personality Inventory in our pre-admissions battery.

For several years Hamline University has admitted, upon
psychological testing, a few applicants who have not been
graduated from high school. Students have been enrolled also
without strict adherence to the traditional pattern require-
ments if they possessed high scholastic potentialities. During
the war emergency non-high-school graduates were admitted to
Hamline University provided it was revealed by tests that they
were capable of carrying work at the college level and pro-
vided that they were recommended for such admission by their
high school principals.

Returned veterans are given the same consideration that is
extended to other candidates for admission and counseling.
The veteran who has taken a battery of tests at the Veterans
Administration is asked to request from them a profile of his
test results and it is not necessary for him to repeat similar tests 
at our institution. We do not require the veteran to take the 
General Educational Development Tests but many respond 
positively to the invitation to do so. 


Tests in the General Counseling Program 


College students need varying amounts of assistance in
making adequate and satisfying adjustments to the responsi-
bilities and opportunities confronting them. The college desires
that each individual should learn to function at his highest
capacity in the many aspects of the growth process. To this
end the counseling program at Hamline University is main-
tained to discover and fulfill the individual needs and interests
of every student in the college community. Fifteen general
counselors collaborate with the director of student personnel
and the deans of the college in providing this service for junior
division students. At the end of his sophomore year, the stu-
dent selects a field of concentration for his last two years of
study and in so doing selects his senior adviser. This senior
adviser directs his program of study, encourages him in
scholarly attainments, and assists him in maximizing the edu-
cational opportunities that are placed before him.

In the counseling program for freshmen and sophomores an
attempt is made to use test data along with other devices in
the scientific counseling of students. A scientific counseling
program needs a sound testing program to support it. In the
counselor-counselee relationship, insights which follow
test interpretations permit the student to arrive at a crystallized
judgment of the significance of the test results for his future
growth and adjustment. Test information has been put to
work by counselors at Hamline University in charting the con-
tinuity of personal and social growth, in encouraging self-com-
petition, and in motivating gifted students and under-achievers.
Test data have made a contribution to the counselor in place-
ment of the student at the appropriate level in the curriculum, in
prediction of potential success in new academic ventures, in
appraisal of student achievement, and in identification of po-
tential strengths or subject matter deficiencies. Counselors
have been aided by test evidence in diagnosis of reading de-
ficiencies, immature study skills and habits, and adjustment
problems. Tests also help the counselor in recommending an
increased or decreased student load, in assisting the student in
vocational planning or confirmation of a goal already chosen,
and in recommending substitute goals for failing students. A
profile summary of test data can be used as a point of departure
for an interview which is prompted by a request for an “inter-
pretation of those vocational tests we took,” or “I want to know
more about my personality.”

The general counselors have efficiently geared their coun-
seling efforts into the administrative machinery of registration
and pre-registration planning for the next academic year. Pre-
registration week each spring affords a rich opportunity for the
general counselor to help the student to harmonize broad edu-
cational and vocational goals with the immediate goal of
thinking through a satisfactory tentative program for the next
educational step. At this time the counselors are informed
that the student can expect from his counselor an interpretation
or review of the results of the vocational interest test and other
tests taken up to date as related to current and future educa-
tional plans. Each sophomore and his counselor are given
separate profiles of the student's status on the Sophomore Cul-
ture Test with both national and local norms. Such objective
information has been found useful as a means of encouraging
or challenging the student.

Recently we have been experimenting with a routine for
dealing with the student of low scholarship and with those on
probation. Each counselor is responsible for arranging an inter-
view with such counselees and for reporting back to the Di-
rector of Student Personnel their status and prognosis. These
case reports are turned over to the various deans who may
desire to counsel certain students.

A more extensive sequence of tests than the entrance battery
is given to individual students who have special needs or in-
terests. In vocational guidance testing no set battery is pre-
scribed but certain well-known aptitude, interest, achievement,
and personality tests are selected from the testing files to meet
the peculiar needs of the individual. The Minnesota Multi- 
phasic Personality Inventory is given occasionally and is a 
distinct contribution to the clinical analysis of certain kinds 
of adjustment problems. The pooled clinical data of multi- 
phasic scores, interview and case history are used as a point of 
departure for psychiatric referral. 

In many institutions of higher learning counseling is now 
considered a normal and expected function in the teacher's 
responsibilities. The faculty counselors at our institution have 
varying amounts of training and experience in testing and 
other personnel procedures. A nuclear group of some fifteen 
general counselors has shown a great deal of interest in learning 
about and discussing the relationship of the implications of test 
results to the counseling process. For this purpose an in-service 
training program for faculty counselors has been set up. Oc- 
casionally talent from the outside is imported. Last fall the 
president of a manufacturing concern led a faculty discussion
on vocational testing in industry and its potential relationship 
to testing and counseling of college students. 

These in-service training sessions usually take the form of
a staff-clinic at which mutual needs and problems of counseling
and test interpretation are discussed. The case study approach
is often used and relevant test data are depicted on a special
blackboard psychograph as a background for the presentation.
The counselors last year expressed a need for readily available
materials on testing and the counseling process and the ad-
ministration furnished each counselor with a set of six current
manuals or books to add to his present library in this field.³

Tests in Instructional Practices 


Testing also plays a significant role in the teaching process,
as test information is useful for both instructional and coun-
seling purposes. Improvement of certain instructional practices
is inextricably intertwined with and affects the counseling sys-
tem. A systematic testing program can afford a background
of information which permits faculty members to adapt teach-
ing methods to specific needs and abilities of the student, to
guide the student in the process and progress of learning, and
to waive certain prescribed preliminary course requirements for
students who can demonstrate a high level of competence.
Some of our faculty members use test data to aid them in
following the above practices.
A comprehensive testing program can allow the college to
determine the level in the college at which the student can
compete most successfully and happily, and to consider certain
students for acceleration or accreditation on the basis of
proficiency.
In looking to the future the Faculty of Hamline University
need to think seriously of the above possibilities. A long
range experience with tests helps to pave the way for progres-
sive developments which tend to lead gradually away from too
much course regimentation and the sole use of the formal
credit system.

³ The materials were John G. Darley, Clinical Aspects and Interpreta-
tion of the Strong Vocational Interest Blank; Carl R. Rogers, Counseling
and Psychotherapy; Fred McKinney, The [...] Program; Frances Triggs,
Improve Your Reading; [...] and the National Research Council,
Psychology for the Returning Serviceman.

Several departments in the college have used advantage-
ously standardized achievement and proficiency tests at the
end of a course to aid in the evaluation of the outcomes of in-
struction. Such tests are sometimes used as a partial substi-
tute for the conventional final examination and the students
are counseled regarding their status on national as well as local
norms. Final grades in the second semester course in Fresh-
man English are withheld until the requirements of the stand-
ardized English Achievement Test are met. A standardized
test in English is given also to juniors to encourage continuous
improvement in some of the communication skills. Those who
fall below the level expected of sophomores on the latter test
are given remedial work and are not graduated until they have
established a satisfactory level of competence. In the foreign
languages a student who demonstrates by examination a rea-
sonable proficiency in a foreign language will be exempted
from the requirement in that area. For several years
members of the social studies division have been experi-
menting with a plan whereby a student with a high rank on a
standardized proficiency test may be permitted to select from
certain advanced courses in the division rather than be required
to follow the normal route.

At the beginning of each academic year the ability and
achievement status of the entering freshman class is reviewed
for the faculty. A composite picture of the students with whom
the faculty is to work is furnished on the basis of appraisal of
test results. Comparisons with previous classes and with
national norms are made. The same technique has been used
with regard to the Sophomore Culture Test.

Achievement and proficiency testing services are now co-
ordinated in the student personnel office. Members of the
faculty are encouraged to administer achievement tests when-
ever they are willing to do so and when there is assurance that
the tests will be administered under controlled conditions.
Administration of an achievement test by a teacher in the
normal class situation rather than by an outside testing expert
tends to create a natural class rapport. Faculty participation
has increased in frequency and quality. Test results are made
available to faculty members when they are requested and an
attempt is made to render appropriate interpretations. It has
been our experience that faculty members have been willing
to cooperate in recent experimental testing programs.

The General Educational Development Tests have also been
given to students in various classes at the end of each [...]
the United States Armed Forces Institute series were [...]
project was carried on through the Veterans' Testing Service of
the American Council on Education in order to aid [...]
cooperate but the filling of quotas made it unnecessary. The
willingness to experiment with tests tends to familiarize the
Faculty with the scientific approach to the problem of evalu-
ation of instructional methods and can be a distinct aid to the
college in setting up an objective basis for acceleration and
fair accreditation in recognition of educational achievement
and maturity.



Tests in Research and Evaluation 


In an evaluation program in which an institution decides to 
analyze its student body and to appraise itself, one can draw 
heavily upon test results. But it should be recognized that 
test data are to be used with caution and that tests give clues 
which render only a partial analysis. Test results which are
used to evaluate the quality of instruction can often be quite
misleading. Other criteria need to be used also.

In the fall of 1943 the Committee on Educational Policy
launched an evaluation study of its student body which was
published as the Hamline Studies of 1943-1944 mainly for in-
ternal and administrative use. To describe this research proj-
ect would be to go beyond the scope of this article. It can be
said, however, that a systematic testing program made possible
a more objective evaluation than would have been possible
otherwise. A graphic analysis was made of the quality of
freshmen and seniors with respect to ability and achievement
over a period of several years. This feature of the research
resulted in a number of recommendations and changes including
a more carefully defined and comprehensive program of ad-
missions and public relations, improvement and expansion of
counseling services before and after admission, and a challenge
to the improvement of instruction in certain areas.


Summary 


It would seem from the foregoing that testing can play a
supporting role in student personnel services and in certain
related educational practices. One should not be left with the
impression that tests alone can solve the problems of the col-
lege. Tests will always remain only a part of the whole edu-
cational structure. It has been our thesis that testing as an
integral part of student personnel services can play a founda-
tional role in individualizing a counseling and teaching program
and in evaluating many of the practices and procedures of insti-
tutions of higher learning.

COUNSELING AND THE USE OF TESTS IN THE 
STUDENT PERSONNEL BUREAU AT THE 
UNIVERSITY OF ILLINOIS 


H. W. BAILEY, WILLIAM M. GILBERT, AND IRWIN A. BERG
University of Illinois

This paper is devoted primarily to a description of the use
of tests in counseling in the Student Personnel Bureau at the
University of Illinois. In addition, a brief account is given
of the use made by the Registrar in admissions procedures of
tests administered by the Bureau. In order that the reader
may see the background in which tests are administered and
used, it seems desirable to include an account of the functions,
staff and procedures of the Student Personnel Bureau as well
as an indication of the number of clients and the kinds of
problems which they present. Though the Student Personnel
Bureau is the technical agency for a Veterans Administration
Advisement Center, this paper will be limited to Bureau services
to students and pre-college clients.


Functions 


The Student Personnel Bureau was established in the Col-
lege of Liberal Arts and Sciences in 1938 to supplement the
work of previously established personnel agencies and faculty
members in counseling with individual students through the use
of the best clinical methods, using standardized tests as one
tool. Counseling of individual students remains the primary
function of the Bureau though the clientele has been enlarged
to include pre-college clients and veterans from the state of
Illinois. Secondary functions of varying importance will ap-
pear later in the paper, but it will be noted that the Bureau has
no administrative or disciplinary responsibilities.

In 1942 the Bureau was made officially an all-University
agency and was placed under the administrative supervision of 
the Provost of the University; the Provost is, in practice, an 
educational vice-president of the institution. It thus appears 
that not only is the Bureau a service agency primarily but it is 
regarded by the University administration as an educational 
agency. The fact that all members of the counseling staff have 
teaching responsibilities indicates further the close relationship 
between the Bureau and the University’s educational program. 


Staff 


The staff of the Student Personnel Bureau is made up of
four groups. The first is the central clinical and administrative
staff. The budget provides for five full-time persons in these
positions, including a Director, an Assistant Director, and three
Clinical Counselors, one of whom also has general supervision
over the psychometric work. The Assistant Director and Clin-
ical Counselors are all trained in clinical psychology. All mem-
bers of this group have academic rank in one of the teaching
departments of the University and all have teaching obligations
though in general not more than one course per semester.

The second group consists of faculty counselors, and the
budget provides for fourteen of them at present. They are
drawn from the faculties of the various colleges and [...]
from quarter-time teaching [...] the Bureau
budget providing funds [...] teaching
thus released by the departments. The faculty counselors are
given a training course before they begin to see clients and
training is continued through regular staff meetings. Since [...]


The fourth group is the stenographic and clerical staff who 
are on full-time appointment under the University Civil Ser- 
vice system. There are four budgetary positions in this group. 


Number and Kinds of Cases 


Since its inception, the Student Personnel Bureau has used
a system of classification of student problems in three broad
areas: educational, vocational, and emotional problems. It is
a matter of policy for the Bureau not to require students to
come in, so that from the Bureau standpoint all contacts are
voluntary on the part of the student. However, a case is
classed as “referred” if the student comes to the Bureau at the
direct request or suggestion of a member of the University staff.
The percentage of new clients who are referred in this sense has
remained markedly stable at about eighteen per cent.

During the year May 1, 1944, to April 30, 1945, 1617 new
clients came to the Bureau and kept one or more appointments
with members of the counseling staff. The following tabulation
shows the percentage distribution by areas and combinations
of areas of the problems which they presented.


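TABLE 1
Area                                          Percentage
Educational ...............................      24.4
Vocational ................................      17.3
Emotional .................................       2.3     44.0
Educational and vocational ................      32.8
Educational and emotional .................       3.1
Vocational and emotional ..................       4.6     40.5
Educational, vocational, emotional ........                15.5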


The sub-totals in this tabulation show that 44.0% presented
problems in one area, 40.5% in two, and 15.5% in all three. Of
the total number of areas represented, one with some clients
and two or three with others, 44.2% were educational, 40.9%
vocational, and 14.9% emotional. The mean number of areas
per client was 1.71.
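
The arithmetic behind these figures can be retraced with a short sketch. It is offered only as an illustration, not as the Bureau's own procedure: the percentages are those of Table 1, the labels attached to the three two-area rows are an assumption inferred from the reported area shares, and the names `TABLE_1` and `summarize` are hypothetical.

```python
# Percentage of new clients by combination of problem areas (Table 1).
# Assumption: the labels of the three two-area rows are inferred from the
# reported area shares, since they are not legible in the tabulation itself.
TABLE_1 = {
    ("educational",): 24.4,
    ("vocational",): 17.3,
    ("emotional",): 2.3,
    ("educational", "vocational"): 32.8,
    ("educational", "emotional"): 3.1,
    ("vocational", "emotional"): 4.6,
    ("educational", "vocational", "emotional"): 15.5,
}


def summarize(table):
    """Return sub-totals by number of areas, each area's share, and the mean."""
    subtotals = {1: 0.0, 2: 0.0, 3: 0.0}
    area_counts = {"educational": 0.0, "vocational": 0.0, "emotional": 0.0}
    for areas, pct in table.items():
        subtotals[len(areas)] += pct      # clients presenting 1, 2, or 3 areas
        for area in areas:
            area_counts[area] += pct      # each area counted once per client
    total_areas = sum(area_counts.values())           # areas per 100 clients
    shares = {a: 100.0 * c / total_areas for a, c in area_counts.items()}
    mean_areas = total_areas / sum(table.values())    # areas per client
    return subtotals, shares, mean_areas


subtotals, shares, mean_areas = summarize(TABLE_1)
print({k: round(v, 1) for k, v in subtotals.items()})
print({a: round(s, 1) for a, s in shares.items()})
print(mean_areas)
```

Under these assumptions the sub-totals and the three area shares agree with the figures quoted in the text, and the mean comes to about 1.715 areas per client.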
There have been few reports in the literature on the in-
tensity of student problems. The use of marked sensing cards
(International Business Machines) has made possible classifi-
cation to show the presence or absence of problems in each area
and their relative intensity. Preliminary classification is made
on the permanent record card by the counselor at the time of
the first interview, with a final classification whenever the
counselor feels he is in a position to make it. The following
code is used: 0—absence of problems in the area; 1—mild
problems; 2—moderately serious problems; 3—serious prob-
lems. With four weights in each of three areas, there are 64
possible combinations of weights, or weight marks, such as 021,
which means no problems in the educational area, moderately
serious problems in the vocational area, and mild problems in
the emotional area.
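
The coding scheme just described can be sketched in a few lines. This is a minimal illustration only: the area order (educational, vocational, emotional) and the four weights come from the text, while the function names `all_weight_marks`, `decode`, and `max_intensity` are hypothetical and do not describe the punched-card procedure actually used.

```python
from itertools import product

# Order of the three digits in a weight mark, as described in the text.
AREAS = ("educational", "vocational", "emotional")
WEIGHTS = {
    0: "no problems",
    1: "mild",
    2: "moderately serious",
    3: "serious",
}


def all_weight_marks():
    """List every possible three-digit weight mark (4 weights, 3 areas)."""
    return ["".join(str(w) for w in combo)
            for combo in product(range(4), repeat=3)]


def _check(mark):
    """Reject anything that is not exactly three digits drawn from 0-3."""
    if len(mark) != 3 or any(ch not in "0123" for ch in mark):
        raise ValueError(f"not a valid weight mark: {mark!r}")


def decode(mark):
    """Translate a weight mark such as '021' into a description per area."""
    _check(mark)
    return {area: WEIGHTS[int(ch)] for area, ch in zip(AREAS, mark)}


def max_intensity(mark):
    """Return the most serious weight appearing anywhere in the mark."""
    _check(mark)
    return max(int(ch) for ch in mark)


print(len(all_weight_marks()))
print(decode("021"))
print(max_intensity("120"))
```

Enumerating the marks confirms the count of 4 x 4 x 4 = 64, and decoding 021 returns the reading given in the text.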

A spot check of two samples of 100 each was made from
cases originating in the period from June 1943 through May
1944. Both samples were compared separately and together
with the total group of cases originating in this period, and in
every comparison they were found to differ from the total popu-
lation well within the expected range of sampling fluctuation
under a chi-square criterion. In the first sample, 33 of the 64
weight marks were used; in the second, 32; in the two together,
48. The weight marks used ten or more times in the total
sample of 200 cases were 100, 120, 010, 110, 020. These five
combinations of educational and vocational problems accounted
for 52% of the total. The problem of maximum intensity was
mild in 43.5% of the cases, moderately serious in 31.5%, and
serious in 25.0%. The mean intensity (sum of the weights di-
vided by the number of cases) was 1.56 in the educational area,
1.68 in the vocational area, and 1.72 in the emotional area.
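
A comparison of a sample with the total population under a chi-square criterion might be sketched as follows. The text does not give the categories or counts actually compared, so every number below is hypothetical, as are the names `chi_square_statistic`, `population`, and `sample`; only the 5 per cent critical value for two degrees of freedom (5.991) is a standard tabled figure.

```python
def chi_square_statistic(observed, population_proportions):
    """Goodness-of-fit statistic: sum of (observed - expected)^2 / expected."""
    if len(observed) != len(population_proportions):
        raise ValueError("observed and proportions must have the same length")
    if abs(sum(population_proportions) - 1.0) > 1e-9:
        raise ValueError("population proportions must sum to 1")
    n = sum(observed)
    stat = 0.0
    for obs, p in zip(observed, population_proportions):
        expected = n * p              # count expected if the sample is typical
        stat += (obs - expected) ** 2 / expected
    return stat


# Hypothetical illustration: three categories of cases in the population
# and the corresponding counts found in one sample of 100 cases.
population = [0.45, 0.30, 0.25]
sample = [43, 32, 25]

CRITICAL_05_DF2 = 5.991   # tabled chi-square value, 2 degrees of freedom, 5% level
stat = chi_square_statistic(sample, population)
print(stat, stat < CRITICAL_05_DF2)
```

A statistic well below the critical value, as in this made-up case, is what "well within the expected range of sampling fluctuation" would look like in practice.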


[...] educational and vocational areas, is accounted for by the fact
that more than half of the freshmen entering in June and
October 1943, and in February 1944 came to the Bureau to find
out the results of their Freshman Guidance Examinations,
which will be described in the next section. A client of this
group who is making a satisfactory adjustment to college is
commonly classified 100. It is clear, however, that in the stu-
dent mind the Student Personnel Bureau is not for problem
students only. This attitude is gratifying because the
staff has made a deliberate effort to cultivate exactly this 
feeling. 


Freshman Guidance Examinations 


Freshmen entering the University of Illinois are required 
by the several colleges which admit freshmen to take a battery 
of scholastic achievement and aptitude tests which go by the 
title of Freshman Guidance Examinations. They are given
during Freshman Week, after the student has registered. Hence 
they are not a part of the admissions procedure, but are de- 
signed as a preliminary basis for counseling with the individual 
Student. Freshmen are urged to come to the Student Personnel 
Bureau for an interpretation of the results. A description of 
the procedure from the student standpoint will be found in a 
later section. The Dean of each college is furnished with a copy 
of the results for his freshmen. 

As a result of statistical studies of experimental batteries, 
there have been several modifications in the constitution of the 
battery since it was first given. The battery for all freshmen 
except those in Engineering is made up at the present time of 
the following tests: American Council on Education Psycho-
logical Examination, college form, scored for quantitative, lin-
guistic and total; Van Wagenen Rate of Reading Test; Co-
operative Mathematics, Natural Science and Social Science
Proficiencies, each scored for comprehension and total; and Co-
operative English Mechanics. The three comprehension tests
are also combined to give what current research indicates is
probably a reading comprehension score; the three Proficiency
total scores and the English Mechanics are combined to give a
High School Proficiency score; and the entire test battery yields
a Composite score. The results are given to the counselor in
terms of raw scores and centile ranks on University of Illinois
norms by colleges.

The battery for Engineering freshmen is made up of:
American Council on Education Psychological Examination,
college form, scored for both parts and total; Cooperative
Mathematics Proficiency, scored for comprehension and total;
Cooperative Mathematics Survey; Cooperative English Me-
chanics; Minnesota Paper Form Board; and Bennett Test of
Mechanical Comprehension, form BB. The entire test battery 
is combined to give a Composite score. 

Each of these test batteries has an administration time of 
about five and one half hours. Further differentiation of the 
Freshman Guidance Examination battery, especially for fresh- 
men in Fine and Applied Arts, is desirable and is being studied. 

The great majority of freshmen take the Freshman Gui- 
dance Examinations in the group testing during Freshman 
Week. However, increasing numbers of high school seniors and 
graduates are requesting pre-college counseling and the Fresh-
man Guidance Examinations form an integral part of the test-
ing done with such clients. Pre-college counseling goes on
throughout the year, the heaviest load coming in the period
from Christmas to the end of the first semester and during the
summer months. 

Procedure 


John Jones is coming to the Student Personnel Bureau for
the first time, having heard of its services from one of his 
friends. Let us follow him and get a general view of Bureau 
procedure. 

When Jones comes to the receptionist to make an appoint-
ment with one of the counseling staff, he is asked to fill out an
individual information record. This form of three mimeo-
graphed pages covers the following: name, age, local address;
educational record, including high school attended and quarter
rank in his graduating class, other schools and colleges attended
and the course taken, and scholastic status; military service
and specialized training; work experience; declared vocational
interests, with reasons; main reasons for coming to college;
siblings with age, education and occupation of each and marital
status of parents—together, separated, divorced, remarried;
amount of study, study efficiency, and outside work; a check
list of personality traits; a check list of diseases and neurotic
symptoms; and purpose in coming to the Bureau.

On the basis of Jones’ request for any particular counselor,
the college in which he is registered, the check list of personality
traits and neurotic symptoms, and his purpose in coming to the Bu-
reau the receptionist makes an appointment for Jones with one
of the counselors. Prior to the time of the appointment, the 
counselor has Jones’ folder which contains the individual in- 
formation record and the results of any tests such as the 
Freshman Guidance Examinations. The counselor therefore 
has a considerable body of information about Jones before the 
interview.

The first interview has as its first objective at least a tenta- 
tive determination of the problems on which Jones is seeking 
counseling. In many cases the problems are easily identified,
but in some cases the client may be unable to bring out the
basic problems until after several interviews and then only
piecemeal. Of course the basic problems may be quite different
from those stated on the individual information record; this is 
especially true of the more deep-seated personality problems. 

When, through discussion, the nature of Jones’ problems 
begins to become clear, a joint decision must be made as to what 
further information is needed. If the counselor feels that 
needed information can be obtained from tests, he explains the 
nature of the tests which he would like to have administered 
and their possible use in providing information. Unless Jones 
Can see the possible utility of the tests to him, he is likely to 
forget to take them or to return to see the counselor. If Jones 
indicates his wish to get the further information which the tests 
may provide, he is given a card with the tests agreed upon
checked which he takes to the testing room when it is con-
venient for him to start testing. On completion of the tests,
Jones returns to the receptionist to make another appointment
with the counselor.

Again the counselor has Jones’ folder containing all the
available information before the time of the appointment. One
of the tasks of this interview is to interpret the test results and
their implications to Jones. Test results are not considered by
themselves but as a part of the total picture. Inconsistencies
of pattern are brought out and any conflicting evidence is dis-
cussed. If still more information from tests is found desirable,
Jones may decide to take still other tests. Or, no further tests
or interviews may be indicated because Jones sees an answer to
his problems. Or, there may be need of repeated interviews,
sometimes lasting over months and involving as many as forty 
interviews. The determination of procedure in each case is an
individual matter between the counselor and the client. 


Use of Special Test Batteries 


Mention has been made of the Freshman Guidance Ex-
amination battery as a preliminary basis for individual coun-
seling. This battery is used with other tests with two special
groups of students and other special batteries have been 
developed for specific groups. 

Under Board of Trustees regulations a student who ranked 
in the lowest quarter of his high school class enters the Uni- 
versity of Illinois on probation. In connection with his first 
registration he is required to take such tests as may be pre- 
scribed by the Student Personnel Bureau. On registration he 
is placed under the special supervision of the Dean of the Col-
lege in which he enrolls, and may be required to carry a reduced
program or a program especially arranged to meet his needs.
The test battery given to this group is made up of the Freshman
Guidance Examinations and the Kuder Preference Record.
The results are given to a special counselor in the student's
college who arranges a program in consultation with the student
in the light of the test results and within the special regulations
of the college. A study of the performance of the lowest
quarter students is found in (2).

Under Board of Trustees regulations, a student in the high-
est quarter of his high school class, who has completed at least
14 units acceptable toward admission in the curriculum he
desires to enter, including all the subjects especially prescribed
for admission to this curriculum, and who is recommended by
a committee of his high school faculty, may be admitted to the
University on demonstrating that he possesses the intellectual
ability, social maturity, and emotional stability essential to
success in college by passing satisfactorily such tests as may be
prescribed and administered by the Student Personnel Bureau.
In general, a rank below the 75th centile on University of
Illinois norms is cause for denial of admission under this plan
of acceleration. The Freshman Guidance Examinations,
the Harrower-Erickson Multiple-choice and the Kuder Prefer-
ence Record form the battery used under this plan, the clini-
cian also depending upon an interview for a check on the
social maturity and emotional stability of the student. A 
study of the performance of a group of accelerated students is 
found in (1). 
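
The centile standard mentioned above can be illustrated with a small sketch. It rests on several assumptions that the text does not settle: the centile rank is computed here by one common convention (scores below, plus half of those tied), the norm group is a made-up list of scores, and the names `centile_rank` and `meets_acceleration_standard` are hypothetical. The text also makes clear that the clinician's interview enters the decision, which no such rule captures.

```python
from bisect import bisect_left, bisect_right


def centile_rank(score, norm_scores):
    """Per cent of the norm group below the score, counting ties as half."""
    ordered = sorted(norm_scores)
    if not ordered:
        raise ValueError("norm group is empty")
    below = bisect_left(ordered, score)             # scores strictly lower
    equal = bisect_right(ordered, score) - below    # scores tied with it
    return 100.0 * (below + 0.5 * equal) / len(ordered)


def meets_acceleration_standard(score, norm_scores, minimum_centile=75.0):
    """True when the centile rank is not below the stated minimum."""
    return centile_rank(score, norm_scores) >= minimum_centile


# Hypothetical norm group: composite scores 1 through 100, one of each.
norms = list(range(1, 101))
print(centile_rank(80, norms))
print(meets_acceleration_standard(80, norms))
print(meets_acceleration_standard(70, norms))
```

The threshold is left as a parameter because the text states the 75th centile only as a general rule.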

The Student Personnel Bureau acts as an auxiliary to the 
Registrar in two kinds of admission problems with veterans 
through the administration of The United States Armed Forces 
Institute Tests of General Educational Development. Such 
cases may or may not involve counseling, depending on indi-
vidual circumstances.

A veteran who is not a high-school graduate or is a graduate 
of a non-accredited high school and who applies for admission 
to the University, is referred by the Registrar to the Bureau for 
the high-school form of the General Educational Development 
Tests. The Bureau reports the scores on all five tests to the 
Registrar and admission is granted, either with clear status or
on probation, or denied on the basis of the results in accordance
with predetermined standards. However, the counselor may
recommend to the Dean of the College certain courses which
should be taken by the student in his first semester or year in
order to remedy deficiencies in preparation disclosed by the
high school record and additional tests of scholastic achieve-
ment. In several cases, the Bureau has been requested to
certify the results of these tests to the high school which the
student attended before entering service as the basis for
graduation from high school.

A veteran applying for admission to a college or curriculum
with stated qualitative standards, whose previous college
record is below the minimum, is referred by the Registrar to
the Bureau for the college form of the General Educational De-
velopment Tests. The results are reported to the Registrar
for action on the application in accordance with fixed minimum
standards. In a few isolated instances college credit has been
granted on the basis of performance on the college form of these
tests.


At the time this is written there has been no opportunity to
form a judgment as to the effectiveness of admissions pro- 
cedures using the General Educational Development Tests 
because they were introduced in the fall of 1945. 

As a result of a statistical study made in the Student Per- 
sonnel Bureau of the relation between scores on the Cooperative 
Mathematics Proficiency Test and grades in college algebra 
courses, the Department of Mathematics used the test the first 
semester to counsel with veterans on whether they were pre- 
pared for either of the two college algebra courses with differ- 
ing prerequisites which the Department offers, or whether they 
should enroll first in a special non-credit course in elementary 
algebra and plane geometry. The response from the veterans 
was so favorable that the Department is requiring all veterans 
who expect to take college algebra the second semester to take 
the test, administered by the Bureau, before registration for
use in planning individual programs.

Since student nurses in the training school of one of the local
hospitals take eighteen hours of course work at the University
of Illinois as part of their nursing course, all probationers are
tested by the Student Personnel Bureau shortly after admission
to nursing training. The tests used are: the American
Council on Education Psychological Examination, college form,
scored for both parts and total; the Kuder Preference Record;
the Harrower-Erickson Multiple-choice; and three nursing
tests prepared by Thelma Hunt, Aptitude, Arithmetic, and
Reading Comprehension, 1940 edition. This battery has
proved to be particularly useful in counseling with student
nurses interested in one of the nursing specialties.

In the fall of 1942 a special testing and counseling program
for graduate Library School students was organized. The
battery included: the American Council on Education Psychological
Examination, college form, scored for both parts and
total; the Cooperative General Culture Test, except for the
mathematics section, with reduced time limits of twenty minutes
per section; the Personal Audit; the Minnesota Clerical Test; and
the Strong Vocational Interest Blank. On the basis of
unpublished statistical studies, the Minnesota Clerical, the Personal
Audit, and the science section of the Cooperative General
Culture Test have been dropped from the battery and the 
Kuder Preference Record, the Harrower-Erickson Multiple- 
choice adaptation of the Rorschach, and the Moss Social Intel- 
ligence have been added. These tests have been utilized in 
counseling with the library students and they also play a role 
in the admission of certain students. 

Special test batteries have likewise been utilized in connec- 
tion with occupational therapy students. A test battery con- 
sisting of the American Council on Education Psychological 
Examination, the Bennett Mechanical Comprehension, the Co- 
operative General Culture (parts I, IV, V, with twenty minutes 
per section time limit), the Harrower-Erickson Multiple-choice, 
and the Kuder Preference Record, was administered in 1945 to 
graduate students in the government-subsidized emergency 
courses in six of the leading occupational therapy schools in 
the country, including the University of Illinois. This test 
battery gives promise of usefulness in the selection of students 
for occupational therapy training generally. To the regular 
occupational therapy students enrolled at the University of
Illinois, the same test battery was administered in 1945 with 
the addition of the Moss Social Intelligence Test. These tests, 
as well as the Freshman Guidance battery, are used not only 
for counseling purposes but also in conjunction with admission 
to the curriculum and continuance in it. 


Counseling Procedures and Policy 


Since the establishment of the Bureau a genuine clinical pro- 
cedure has always been utilized in any case where counseling 
occurs. All known or discoverable factors in a student's problem
are taken into consideration, including family background,
social development, health history, school record, emotional
status, and the like, as well as the results of psychological tests.

While everyone would agree that a clinical approach is the
only desirable one, the clinical approach is emphasized here
because all too frequently undue weight is placed upon the
administration and interpretation of various psychological tests
with a relative disregard for other factors which are equally or
even more important with respect to the individual's total
adjustment. It is a common experience to find a discrepancy
between the student’s felt and stated interests and his measured 
interests and aptitudes. The motivational factors represented 
by such statements and expressed feelings must be carefully 
weighed and discussed in connection with the other evidence. 
In those instances where a student’s felt interests are clear-cut 
and strong and where they do conflict with measured interests, 
aptitudes, and achievements, no attempt is made authorita- 
tively to urge the student to change his program in accordance 
with the objective results. The emotional problem created 
by such a forced change would in most instances more than 
counterbalance the higher measured interests and aptitudes for 
the given field. 

All Bureau counselors, including part-time faculty counselors,
recognize the desirability of a clinical approach, and this
approach is invariably utilized. So far as more specific counseling
procedures are concerned, the non-authoritative, non-directive
approach is generally utilized since a conclusion arrived
at by the student in joint discussion is much more apt to
lead to appropriate action than a decision which is primarily
the result of a pep talk or sales talk on the part of the counselor.
A non-authoritative approach can and should be used generally,
not only with students suffering from emotional disorders
but also with students who have educational or vocational
problems.

However, clinical experience also indicates that while a non-directive
approach is generally the most desirable, it is not
always so. There are occasions even in counseling in the mental
hygiene area [...] and more useful. This is
true, for example, of a student who suffers from serious [...]
feelings [...] underlying problems have been
worked through [...]
possible and desirable [...] need information, encouragement,
and [...] social groups before [...] the necessary [...] action.
In such instances, there is no hesitancy in using persuasion and
even exhortation in helping the student “over the hump.”

Bureau counseling procedure is, therefore, eclectic. The 
counselor may even shift several times from a completely non- 
directive to a completely authoritative procedure and back to 
a non-directive one in the course of an hour’s interview. He
may supply the student with vocational or other information 
or he may tell the student where he can get such information. 
The approach is the pragmatic one of utilizing whatever seem 
to be—in the light of the whole clinical picture as it unfolds— 
those procedures which most efficiently aid the student, and 
there is no hesitancy in shifting from one type of approach to 
another if the first does not produce the mutually desired 
results.

Educational Counseling 

Some of the commoner problems which the counselor meets 
in the educational area are organization of time, development 
of effective study habits, both in general and in specific areas, 
need for remedial reading instruction, determination of level
of scholastic ability and of areas of special ability and dis- 
ability. But Table 1 shows that educational problems appear 
in connection with vocational or personal problems, or both, 
twice as frequently as alone. 

The Freshman Guidance Examinations give direct informa- 
tion on the need for remedial reading, level of scholastic ability 
and areas of special ability and disability. Further diagnosis 
in the reading field is left to the clinician. However, poor
reading skills in a special area, such as mathematics or chemistry,
may account for classroom performance which of itself
suggests insufficient study or disability in the area. Here the
Freshman Guidance Examinations may serve to eliminate some
of the possibilities. As an example, a freshman with D work in
College Algebra and C and B work in his other courses ranked
uniformly above the 60th centile on all the Freshman Guidance
Examinations, including two mathematics tests, and was studying
some two hours per assignment. The counselor found that
the student was merely scanning the expository material in the
algebra text and trying to memorize meaningless formulas as
tools for working problems according to the illustrative examples.
This is clearly a place where specific reading help is
needed.

The level of scholastic ability is of importance in several 
situations. Here is a student who is barely doing acceptable 
work and whose Freshman Guidance Examinations place him 
uniformly in the highest quarter of the norm group. A second 
student, whose Freshman Guidance Examinations place him as 
a borderline college risk, is studying from forty to fifty hours 
per week in a vain endeavor to maintain an average above B,
and is getting discouraged and growing tired of the monotony
of college. A pre-college client is debating whether to try college
work at all, or to take up an apprenticeship in a trade where
he has some experience and demonstrated ability. The counselor’s
problem with the underachiever is to help him find out
why he is achieving so far below capacity, help him fix suitable
goals, and to help him reach these goals through remedial
measures. With the second student it is to help him plan
scholastic and vocational goals consonant with his ability. In
both cases, interpretation of the test results in the light of
the other evidence is necessarily informative, but the more
difficult and more basic counseling begins after discussion of
the evidence, and it is non-directive or else wasted. Discussion
of the evidence alone may be sufficient to enable the third
student to come to a decision.

The two most common problems of the entering freshman
are organization of time and development of effective study
habits. Both are complicated by the unrealistic picture in the
minds of many freshmen of the demands which college makes
as compared with high school in the way of more intensive
work, greater speed of learning, longer assignments, and courses
requiring two to three times as much study. To the student
[...] in high school with eight to [...],
it [...] a shock to be [...]
about twenty-five hours a week [...]
class [...]. If [...]
freshman who feels the [...] of
[...] sort, it may be sufficient for the counselor [...]
a time schedule together, the counselor
calling the student’s attention to the reasons for doing
things at certain times and the modifications which can be
made when necessary. The student who has not learned to 
plan his use of time after one or two semesters presents a more 
difficult problem, and here the use of a weekly time chart which 
the student keeps for three or four weeks and discusses with the 
counselor weekly is frequently useful. The basic responsibility 
is the student’s; he must develop sufficient self-discipline to 
carry out that which he knows he should do. Exhortation by
the counselor is fruitless. 

The use of Wrenn’s Study Habits Inventory is frequently 
helpful to the student who wishes aid in developing effective 
study habits, by isolating those areas which need particular attention.
The counselor may then give suggestions directly or
may refer the student to specific sections in one or more of the
standard works in this field; Wrenn and Larsen, Studying
Effectively, has proved especially useful. 

In addition to the Freshman Guidance Examinations and 
other special batteries described previously, the counselor has 
available a large number of other standardized tests of general 
ability, achievement, ability in special fields, and aptitude. If 
there is evidence that a student works too slowly to do himself 
justice on speed tests, power tests may be administered. The
Wechsler-Bellevue Intelligence Examination is very useful if
an individual test is desired, especially if the counselor suspects
the existence of a considerable verbal-performance differential.
The report of the test administrator may contain significant
information relating to the personality. Other tests than those
already mentioned which are commonly used include: the
Minnesota Test for Clerical Workers, the Iowa Silent Reading
Test, the Iowa High School Content, and the Ohio State
University Psychological Test.


Vocational Counseling 


While a considerable number of interviews may be necessary,
those students who have made no vocational choice or
who are uncertain of their vocational goals rarely represent a
complex counseling problem if there are no complicating factors,
such as family pressure. The Kuder Preference Record and the
Strong Vocational Interest Blank are usually assigned routinely. 
During a conference various jobs are discussed in the light 
of all the evidence, including Freshman Guidance Examina- 
tions, vocational interest tests, the scholastic record, and work 
experience. The length and nature of training, rate of pay,
conditions of work, and personality requirements of each job
are briefly outlined. Vocational choices are thus narrowed to
several strong possibilities. For further information the student
may be referred for further tests, printed vocational material,
conference with one or more faculty members in the areas
under consideration, and exploratory courses in the University.
The determination of what sources of information are to be
used is decided individually. The tests used include the Personal
Audit, the Minnesota Multiphasic Personality Inventory,
the Minnesota Mechanical Assembly Test, the Seashore Measures
of Musical Talent, and various tests of clerical aptitude.
A further interview or interviews may be needed for the elimination
of some of the possibilities and for final determination of
a vocational goal. Where a semester or more has elapsed between
vocational interest testing and a final vocational decision,
one or both vocational interest blanks may be reassigned in
order to check for possible shifts of major interest areas. Significantly,
such shifts are fairly common, especially where the
period between testings was spent in military service.

But in approximately half the cases of vocational counseling
one or more complications arise. A common complication is
that of vocational stereotypes. When the federal nurse cadet program
was operating, for example, it was not uncommon to have
students enter the counselor's office to ask only about admission
requirements for this program. In encouraging such students
to talk about nursing, it quickly became clear that some
of them pictured themselves as visions in white, smiling therapeutically
at a handsome soldier. Such things as emesis and
suppurating wounds were wholly removed from their
concept of nursing. It may be added parenthetically [...]
heard and [...] contributing to this condition of vocational
stereotypy. The same thing is met in other areas where students
often express impatience with curricular requirements as,
“What good is physics to a surgeon?” or perhaps, “Why should 
an engineer take rhetoric?” 

Perhaps the most curious stereotype is that concerning the 
industrial fields of electronics and plastics. Veterans in particular
sometimes state with the calm assurance of one who
has carefully reached a decision, “I am going into plastics.”
Such assertions are followed by statements that “it is a coming
field. . . .” and the student talks of electronics and plastics as
if they were professions or trades in themselves. That one can
be a chemist, an accountant, a machinist, or even a janitor in
“plastics” frequently comes as a startling revelation to such
students. 

In dealing with vocational stereotypes the problem is es- 
sentially one of getting the student to become more objective 
and less emotional about his vocational decision. Such ques- 
tions, for example, as “Tell me what different kind of jobs there 
are in plastics," often assist the student to the gradual realiza- 
tion that there is no vocation of “plastics” and lead him to 
consider special fields which may offer special training in
plastics. 

Another complication confronting the counselor who works 
with student problems in the vocational area concerns what
Berg and Gilbert (3) call the “white collar" halo. This halo 
effect, while most commonly found among high school stu- 
dents, also frequently appears at the college level. It appears 
as an emotionally toned unwillingness to consider jobs other 
than those which are clearly “white collar” in nature. Thus a
student who could earn a very comfortable living as an electrician
or toolmaker because of an apprenticeship already served
is sometimes strongly determined to become a teacher at half
the pay even though tests indicate he is not college material,
all because he wants to work in clean clothes. 

Also emotionally toned is the occasionally encountered belief
that by determination or will power anyone can succeed
academically. A student who tests in the lowest five per cent
of University of Illinois freshmen and who graduated in the
lowest tenth of his high school class may sometimes insist, “I
know I can get through medical school. I never really worked
hard before; now I mean business!” It often happens that 
where the student is convinced he can get through medical or
some other school by “will power” the counselor can do nothing
except make sure that the student has considered alternative
vocational plans in the event of failure. But in other cases the
counselor is able to assist the student in re-directing his vocational
aims so that the aims are more in accord with his
abilities.

Table 1 shows that three out of every four vocational problem
cases at Illinois overflow into the educational or emotional
problem areas or both. A student engineer, for example, may
request help in choosing another profession. During the initial
interview it may be learned that he is flunking all his courses
(educational area) and that he is quite disturbed because his
parents have flatly refused to permit him to change his course
of training (emotional area). Thus the original vocational
problem really involves three areas instead of one. In such
cases the counselor must frequently work with members of the
student’s family and with various officers of the University, as
well as with the student himself. Vocational problems which
also involve personal or educational problems, as in the example
just given, are met with sufficient frequency that it is
inconceivable that any counselor could restrict his activities to
vocational problems alone and still do an adequate job of
counseling.


Mental Hygiene Counseling and Psychotherapy


The treatment by the [...] of the
Bureau of civilian students and veterans
suffering from various types of emotional and other disorders
[...] special attention. The variety of such disorders
runs the gamut from relatively “normal” problems (such
as slight homesickness and examination tension), through
the more serious psychoneurotic disturbances
(hysteria in which psychological [...]
give rise to physical complaints, phobias, obsessive thoughts,
[...] states), to the sexual abnormalities, and
incipient psychotic disorders such as depressed states involv- 
ing suicidal tendencies and the schizoid reactions of the ex- 
cessively introverted individual. 

In all cases where medical diagnosis and treatment or 
hospitalization seem desirable the individual is referred to the 
Health Service or to private clinics and physicians and to the 
Neuropsychiatric Institute of the Medical College. The num- 
ber of students who must be referred for hospitalization and 
shock therapy or other similar medical treatment is extremely
small, the average being fewer than two students per year. 

This means, of course, that special care is exercised to dis- 
cover those students who need psychotherapy before their emo- 
tional disturbances become so extremely serious as to require 
the more drastic type of treatment or hospitalization. 

There are at least three channels whereby the early discovery
of these emotionally maladjusted individuals is made.
First of all, all faculty counselors receive through the training
program a rather complete practical understanding of the nature
of such disorders, and of the symptoms which indicate a
relatively severe disturbance. They are therefore alert to such
symptoms when they appear, for example in an interview in
which the student has simply requested an interpretation of
the educational and vocational significance of the Freshman
Guidance Examinations. In such cases the faculty counselor
may do one of a number of things. He may attempt to get the
student to “open up” regarding his personal problems and proceed
with counseling in this area as far as he feels competent or
until he feels he can successfully refer the student to one of the
psychologists on the staff. On the other hand he may assign
one or more of the standardized tests or inventories of personality
or adjustment in connection with counseling, which at
this point is still primarily vocational or educational in nature,
and thus postpone exploration in the mental hygiene area or
referral until a later interview. In obviously severe cases,
referral to the psychologist may be made at once by phone. In
such cases the psychologist makes every effort to see the client
at that same hour.

Many of the deans and assistant deans as well as a good
number of the faculty members generally are also
aware of the commoner symptoms of emotional disorders and
they often refer students with such symptoms directly to the
psychologists.

It has not seemed desirable to include any of the
available tests of adjustment in the Freshman Guidance battery.
But the receptionist who receives the individual information
record from the student who is making his appointment
with a Bureau counselor can glance at the
phrases underlined in check lists of personality traits and physical
symptoms while she reads the student’s statement of the
purpose in coming to the Bureau, since they all appear on the
same page. If more than a given number of the
designated unfavorable items are underlined, the student is
referred directly to a psychologist just as he would be to any
other counselor. Since all of the psychologists necessarily do
some purely vocational and educational counseling, there is no
distinction in the minds of most students between them and
any of the other counselors.
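
The receptionist's screening rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article gives neither the list of designated unfavorable items nor the cutoff, so the item labels, the threshold of 2, and the names `InformationRecord` and `route` are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical item labels; the article does not list the designated items.
UNFAVORABLE_ITEMS = frozenset(
    {"nervous", "depressed", "frequent headaches", "insomnia"}
)
# Hypothetical cutoff; the article says only "a given number".
REFERRAL_THRESHOLD = 2


@dataclass
class InformationRecord:
    """One student's information record as the receptionist sees it."""
    purpose: str  # the student's stated purpose in coming to the Bureau
    underlined: frozenset = field(default_factory=frozenset)


def route(record, unfavorable=UNFAVORABLE_ITEMS, threshold=REFERRAL_THRESHOLD):
    """Return 'psychologist' when more than `threshold` designated
    unfavorable items are underlined, otherwise 'counselor'."""
    flagged = record.underlined & unfavorable
    if len(flagged) > threshold:
        return "psychologist"
    return "counselor"
```

Under these assumed values, a record with three designated items underlined would go to a psychologist, while a record with two or fewer would go to any other counselor. The comparison is strict because the text says "more than a given number."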

Before proceeding with a description of the psychotherapeutic
techniques utilized with the more severely disturbed
clients, it should be emphasized here that a considerable amount
of very effective mental hygiene counseling is done by the
non-psychologist faculty counselors. The under-socialized student,
the student who is having trouble emancipating himself from
the family, the student who is unduly self-conscious and many
others are so helped by the faculty counselors that
psychoneuroses are undoubtedly avoided.

The psychotherapeutic approaches utilized in working with
clients with the more serious disorders are as varied as the
knowledge of the psychologist permits and as the needs of the
particular student demand. Various tests of adjustment and
personality are regularly utilized but most often they are subservient
to therapy, except in those special instances which have
been mentioned previously where a personality test is included
in the general test battery. It is a regular practice not to assign
adjustment tests to individual students until good rapport
has been established and even then only when there is
a full understanding by the client of his personal need for the
information which could be supplied by such tests. In many 
instances the responses to individual questions receive more 
attention than the total scores. 

Of the questionnaire type of tests, the Minnesota Multi- 
phasic Personality Inventory, the Bell Adjustment Inventory, 
both student and adult forms, the Bernreuter Personality In- 
ventory, the Adams and Lepley Personal Audit, and the 
Mooney Check List are most frequently assigned. The full 
Rorschach as well as its adaptations for group work and, less 
frequently, tests such as the Murray Thematic Apperception 
are used with the more disturbed clients when the counselor 
meets an impasse in psychotherapy or when he desires objective
diagnostic information. 

The only Bureau policy with respect to psychotherapy is 
entailed in the selection of clinical psychologists for staff mem- 
bers who are not blind adherents to any particular type of 
psychotherapeutic approach and who can adapt their techniques
to the individual student, to his peculiar problems and
to the momentary demands of the counseling situation. While
it is true that a non-directive procedure such as that described
by Rogers (6) is generally utilized, other techniques are also
used where it appears that they would be more effective.
Adolph Meyer's (5) distributive analysis and synthesis pro- 
cedure is sometimes used. Relaxation therapy, re-education 
and reconditioning, explanatory therapy, bibliotherapy, hypno- 
therapy, environmental manipulation, persuasion of the sort 
originated by Déjerine (4), and suggestive therapy are all 
utilized as needed. 

Several types of therapy may be utilized with the same
client at different times or simultaneously, and this may all be
combined with educational and vocational counseling. For
example, Miss X was referred to the Bureau because of frequent
fainting attacks and low grades. She indicated that she
disliked the curriculum in which she was registered and wished
to enroll in another. Her parents objected to a change. Numerous
instances of domination by them were cited, including
attempts severely to select her playmates and later her boy
friends, which led to social withdrawal and in [...]
homosexual desires which were associated with the [...].
Sex education had been completely neglected because of the
parents’ puritanical attitudes. She also suffered from [...]
stomach distress, daily headaches, and “nervousness,” [...]
a slight tremor of the hands and a general feeling of [...]
and anxiety. Freshman Guidance Examination results showed
general ability and achievement above the 75th percentile on
Illinois freshman norms. Reading rate was in the lowest [...]
per cent and she complained that she did not have enough time
to get over her assignments.

Without going into detail, it can be stated that the procedure
with this student included non-directive counseling for
about three-fourths of the time spent in counseling,
but also persuasion (for a complete medical examination),
relaxation therapy (a reassuring temporary expedient directed
toward control of nervousness), environmental manipulation
(change of roommate), re-education (learning to [...]),
bibliotherapy (sex information), educational counseling
(regarding reading skill), explanatory therapy (the effect of
emotional conflicts on bodily functions), vocational counseling
(discussion of measured aptitudes and vocational interests with
Miss X and her parents), and suggestive therapy (homosexual
desires will disappear as social adjustment improves).

This one example could be multiplied many times, but lest a
wrong impression be created it may be well to emphasize again
that, in general, most of the counseling time is taken up with
non-directive therapy. Even where it seems necessary to give
an interpretation of the origin and nature of some condition, as
the homosexual tendency in the case above, this is generally
accomplished not bluntly but by means of questions which
lead the client to a self-interpretation which he [...]
with the help of the selective and guiding questions [...].
Almost never does an emotional disorder appear alone.
As indicated in Table 1, the emotional problem appears in conjunction
with educational or vocational problems, or both, in
more than 90% of the cases where there is an emotional problem.
Clinical records show that not only do these problems appear
together but that they are so closely interwoven that
piecemeal treatment by different counselors is out of the
question.

A check on a small number of veterans who entered the 
University in the fall semester of last year showed that as compared
with the findings for our whole student body, more of
the veterans had emotional problems. These problems tended 
to be more severe and in a significantly greater number of cases 
they were associated with educational and vocational problems. 

It is therefore obvious that the individual professionally best 
fitted to treat a student or other person who has emotional or 
so-called “mental” disturbances is a clinical psychologist who
has had sound training and experience in both the nature, use,
and interpretation of various psychological tests which are
essential to effective educational and vocational counseling and
in psychotherapeutic procedures. As the general public becomes
acquainted with what the well-trained clinical psychologist
has to offer in this area and with the results produced, there
will be a greatly increased demand for his psychotherapeutic
services by educational institutions and in private practice.

During this school year a sharp increase in the number of 
students with emotional problems and in the complexity and
severity of the resulting disturbances has been apparent to the
staff of the Bureau. During the past two months, for example,
nine students with definite suicidal tendencies have been included
in the case load at the Bureau. It is possible that the
increased number of serious emotional maladjustments is an
artifact of the gradually expanding awareness of the services
of the Bureau. The increase is so noticeable, however, that it
seems more probable that the relatively chaotic conditions of
this postwar period are primarily responsible for the increase
and that a still greater increase can be expected in the very near
future.


Summary 
The Student Personnel Bureau at the University of Illinois 
an all-University agency regarded by the administration as 
educational in character, whose primary function is individual
counseling. The clientele is made up of University students,
pre-college clients, and veterans from the state of Illinois. Because
counseling is done in the educational and vocational areas
as well as the area of emotional problems, students appear to
attach no stigma to use of Bureau services. The technical
staff is made up of clinical psychologists, faculty counselors
trained in the Bureau, and test administrators. The approach
to counseling is invariably clinical; therefore, tests are regularly
utilized for the information they can give, but they are
interpreted in the setting of the entire clinical picture. Test
batteries have been developed for specific purposes, both for
counseling and for use of the Registrar in admitting students to
the University or to particular curricula, and may be
administered either on a group or an individual basis. Additional
tests are used in counseling with individuals as the need
appears. Bureau counseling technique is eclectic, with the
greatest emphasis on non-directive methods but with use of directive
methods whenever indicated. Present demands for Bureau
services are pressing, and indications are that the demands will
continue to increase for several years.




THE USE OF TESTS AT MACMURRAY COLLEGE 


WENDELL S. DYSINGER 
MacMurray College 

The testing program at MacMurray College may be divided
into five parts. Tests are used for the admission of students
whose high-school grades give rise to doubts concerning the
promise of college work. A battery is also given during the
freshman orientation period, having educational guidance as
its fundamental objective. A battery of vocational tests is
added to these results for any student who makes request. The
National College Sophomore Tests are administered to all
sophomores in cooperation with the national program. Other
tests are used from time to time for special purposes, as the
Graduate Record Examination and the Medical Aptitude Tests,
and in examinations by departments of the College.


Tests for Admission 


In admitting students to the College, the fundamental con- 


on the campus is reasonably confident. A student in the low
third of his high-school class gives no basis in his grades for a
favorable prediction. Those in the middle third of the high-school
class are in an intermediate position. Their average
grades on the campus have proven to be below those of the
high third, but a substantial number make satisfactory records
with this type of background.
Tests for admission are offered to two groups of applicants.
Some students in the low third of high-school classes have the
capacity for college work. Frequent transfer during the high-school
years, high grades in certain subjects, or a recommendation
from the high-school principal may offer evidence that the
grades in the secondary-school record should not be the final
basis for decision on the application. While such students are
refused admission on the basis of high-school grades, a few of
them are offered an opportunity to take tests for admission.
Some students in the second third of high-school classes give
little better basis for a favorable prediction than do those in the
low third. If they are low in the second third or give in their
recommendations or course selection reason to doubt their
ability to do college work, tests for admission are required to
supplement the high-school record.
The tests which are used for this purpose have been
administered in the freshman orientation batteries. We know the
critical scores of students on these tests. The Stanford-Binet
Scale is frequently administered, especially when the scores on
reading tests are low. The critical score which we have adopted
in the Stanford-Binet intelligence tests is an intelligence
quotient of 118.
From these testing procedures, the College has found a few
good students. One student, for example, who has had a [...]
average in college was in the low third of her high-school class.
She had moved each year of her high-school course. Continuity
of work and training in reading and study habits made possible
a satisfactory college record. Our percentage of success in
such cases, however, is low enough that we do not rely upon
critical scores alone. Some students have the ability to pass
tests which require a few hours but lack the ability to sustain
their efforts over the months. The high-school record is a
better indication of this tendency than is the test score. We
are, therefore, conservative in the decision, preferring not to
admit students unless we feel that success in college is not only
possible but probable.
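
The screening policy described in this section can be summarized, very roughly, as a decision rule. The sketch below is only an illustration in Python. The function names, the argument names, and the two boolean flags are assumptions introduced here, and treating a score of exactly 118 as meeting the critical score is likewise an assumption. As the text makes plain, the actual decision also weighs the high-school record and is not made on critical scores alone.

```python
# Illustrative sketch only: names and flags are hypothetical, not the College's procedure.

STANFORD_BINET_CRITICAL_IQ = 118  # critical score stated in the text


def admission_test_status(class_third: str,
                          doubtful_record: bool = False,
                          special_evidence: bool = False) -> str:
    """Say how admission tests apply to an applicant.

    class_third      -- "high", "middle", or "low" third of the high-school class
    doubtful_record  -- assumed flag: low in the middle third, or doubts raised
                        by recommendations or course selection
    special_evidence -- assumed flag: frequent transfers, high grades in certain
                        subjects, or a principal's recommendation
    """
    if class_third == "high":
        return "no admission tests"
    if class_third == "middle":
        return "tests required" if doubtful_record else "no admission tests"
    if class_third == "low":
        # Such applicants are refused on grades; a few are offered tests.
        return "tests offered" if special_evidence else "refused on grades"
    raise ValueError(f"unknown class third: {class_third!r}")


def meets_critical_score(binet_iq: int) -> bool:
    """True when a Stanford-Binet IQ reaches the critical score (assumed inclusive)."""
    return binet_iq >= STANFORD_BINET_CRITICAL_IQ
```

A rule of this kind only sorts applicants into groups for testing; the conservative final judgment described above remains a matter for the admissions officer.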


The Orientation Test Program

This battery has consisted of the American Council on Education
Psychological Examination for College Freshmen, the
Henmon-Nelson Tests of Mental Ability, the Cooperative
English Test, the Nelson-Denny Reading Test, the Cooperative
General Culture Test, and the Bell Adjustment Inventory. In
the interest of economy, all of the tests are used with answer
sheets except for the Henmon-Nelson and the Nelson-Denny
tests, which can be scored very rapidly.

Two mental tests are used in order to have a check one on
the other. It has been assumed that the higher of the scores
in these tests is the truer. This assumption was not borne out
in a recent study in which we correlated first-semester grades
with the highest score achieved on an intelligence test. The
result was about the same for the whole class as is the correlation
with either of the intelligence test scores. In a number of
individual studies, on the other hand, the higher scores on
intelligence tests give a basis for an understanding of the student
which is lacking without the second test.
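
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that both tests are expressed on a comparable scale (for example, percentile ranks) and using the ordinary product-moment correlation; the function names and the sample figures are hypothetical and are not data from the study.

```python
from math import sqrt
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two cases")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return sxy / sqrt(sxx * syy)


def higher_score(test_a: Sequence[float], test_b: Sequence[float]) -> List[float]:
    """For each student, keep the higher of the two test scores."""
    return [max(a, b) for a, b in zip(test_a, test_b)]


if __name__ == "__main__":
    # Hypothetical percentile ranks and first-semester grade averages.
    first_test = [55, 70, 40, 85, 60, 30]
    second_test = [60, 65, 50, 80, 45, 35]
    grades = [2.4, 3.0, 2.0, 3.6, 2.5, 1.8]

    print("first test vs. grades: ", round(pearson_r(first_test, grades), 3))
    print("second test vs. grades:", round(pearson_r(second_test, grades), 3))
    print("higher score vs. grades:",
          round(pearson_r(higher_score(first_test, second_test), grades), 3))
```

If the three coefficients come out about equal, as reported for the whole class, the higher score adds nothing to prediction for the group, though it may still be informative in the individual case.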

There is some question about the wisdom of the use of the
Cooperative General Culture Test. It is a severe test for college
freshmen. We have now used it for two years: (1) to test
its appropriateness for freshmen and (2) to study the growth of
students during their first two years in the College through a
direct comparison with the parallel results in the sophomore
tests.

A personality inventory such as the Bell Adjustment Inventory
is usually used in freshman tests, but we have reservations
about its desirability in a required test battery. Some of
the questions in such an inventory must be quite personal.
There is a problem in gaining the requisite frankness from
students. There is a further problem, however, which should
probably be regarded seriously. The invasion of a student's
right to privacy concerning personal phases of his life is a very
real possibility. In attempting to study the mental health of
a young person we must be sure that he freely accepts the task
which the inventory gives to him. For this reason the instructions
in the administration of the Bell inventory have been
modified. Students have been reminded that they need not
answer the questions frankly unless they choose to do so; if
they feel that any question is more personal than they wish to
answer, they may omit it or even answer it [...]; the
educational reasons for the administration of the inventory are
then explained, and the cooperation of the students is invited.
I would judge that such instructions add to the value of this
inventory. We must dismiss the published norms, but our
local norms are close to those of the manual.

Each student is given the opportunity to have the results
of this test battery interpreted in conference with one of the
members of the personnel staff. These test reports seem to be
one of the major techniques of personnel work, particularly
among freshmen and sophomores in college. A more detailed
description has previously been published.1

These test reports require from thirty to fifty minutes.
They concern not only the student's scores on the test battery
but his high-school record, his reading and study skills, his
selection of major fields of study, the results of his personality
inventory, his adjustment to the campus, and the planning
by him of a desirable four-year course of study. The content
of the conference varies with the felt needs of the individual
student. Students are interested in their scores, in the comparison
of these scores with their high-school grades, and in the
prediction which these scores make possible concerning success
in college. The strong points in a student's equipment are
emphasized without sacrificing frankness. All of these conferences
are held at the request of the student. More than two-thirds
of all of our freshmen in the past five years have
requested a test report. This method adds greatly to the value
of the testing program.


Vocational Tests

The tests in the freshman battery are chosen primarily for
educational guidance, but they also serve as the foundation of
the vocational testing program. The sophomore testing program,
when available at the time of the vocational conference, adds
information which is also important. The whole developmental
record of the student, including tests and grades in
secondary school, is assembled as a matter of routine prior to the
vocational conference.

It takes roughly ten hours of testing as a minimum to give
the guidance officer the information available from modern
instruments. Much worthwhile assistance may be given in
vocational planning without the benefit of tests. When the
tests total substantially less than ten hours, however, the guidance
officer should realize that he is relying on general information
and not on test results for his facts and his interpretations.

1 Dysinger, W. S. "T[...] as [...] Counseling." EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT, [...].

The instruments which we use in the first vocational battery 
are the Aids to the Vocational Interview (Record Form B) of 
the Psychological Corporation, the Strong Vocational Interest 
Blank for Women, the Cleeton Vocational Interest Inventory, 
and the Kuder Preference Record. 

The Aids to the Vocational Interview gives the student
opportunity for self-rating, for reviewing avocational, vocational,
and educational experiences, together with some record of home
background and tentative plans. The other three instruments
are used for the study of interests and preferences. The whole
field of the measurement of interests, particularly with reference
to the vocational interests of women, is sufficiently nebulous
to warrant the use of several of these tests. It is our
practice to administer not more than one of them on a single
day.
We find that the Strong blank is less useful for women than
it is for men. It frequently happens that students are high in
Group V (nurse, office worker, stenographer-secretary, housewife)
and low in the other occupations. Kuder's blank and
Cleeton's blank frequently are instructive where the Strong
blank is not of much service.

Results from these different tests are sometimes contradictory
but are more often supplementary. One could not expect
too close an agreement. The norms are based upon different
occupational levels and are organized about different
occupational classifications. The techniques through which
students express their interests or preferences are different, and
the fundamental comparisons which lie in back of the scores
themselves vary correspondingly. The results, nevertheless,
reach in most cases a substantial agreement. It is a matter of
importance to the guidance counselor to understand the
contradictions. They may reflect the immaturity of the student or
may be a function of the techniques themselves. The real
problem is to understand, not to obtain scores.

This first vocational battery is frequently not sufficient.
When vocations in some occupations are included among the
possibilities, other tests should be added. We use for this purpose
an achievement test in music, the Seashore Measures of
Musical Talents, the Meier Art Tests, the Minnesota Paper
Form Board Test, the Minnesota Vocational Test for Clerical
Workers, the Stanford Scientific Aptitude Test, the Cooperative
Literary Acquaintance Test, manual dexterity or mechanical
aptitude tests, and other tests of special aptitude and achievement.
The choice of these further testing procedures is made
during the conference which considers at least the freshman
orientation battery and the group of vocational interest tests.
When this second battery is indicated, another conference is
scheduled to consider the results. The follow-up procedures are
outlined whenever testing seems to have given its full
contribution.

The interpretation of a vocational test battery assumes
that it is wise for a student to choose a vocation in a field where
her abilities, achievements, and interests are high. The task of
interpretation is to relate these results to the requirements of
different vocations and to the educational program necessary for
preparation.

The vocational plan is by no means independent of the educational
program. Vocational choice for most young women
is a problem substantially different from that of young men.
The vocations which they are selecting will serve the majority
as a source of income for a few years after college,
as life insurance in the case of tragedy in their homes, as background
for the many community services which college women
will perform in the future, and as a phase of the development of a
sense of personal competence which comes from self-support.
For some who will not marry, this choice will turn into a lifetime
career. This uncertainty adds complications to vocational
planning for many young women.
The educational plan for the ablest young women
seems nevertheless to be clear. Vocations at professional and
semi-professional levels require the same type of liberal education
which is most desirable for constructive citizenship in the
home and in the community. While many young women are
giving much attention to the vocational problem, frequently
stimulated by the attitudes of their homes, the educational program
of the ablest among them is not so greatly modified.
Their undergraduate work should be strong in fundamental
understanding, and this represents the best preparation for
vocational competence at professional level.

The purposes which students have in mind in requesting the
vocational tests vary rather widely. Some have a definite
vocational plan. They are essentially asking whether the testing
procedures offer support to this program. Other students are
hesitating between two or three possibilities and they hope that
the test results will assist them in this choice. Other students
seem to have no vocational plans, and they ask the tests to
introduce them to the systematic consideration of this problem.
All of these groups tend to seek some new suggestion which
might be made through the tests.

A vocational test battery need not lead to an immediate
vocational decision in order to be valuable. A vocational decision
is in reality a series of decisions which may require a number
of years for completion. Vocational testing is serviceable whenever
it aids the student in making the next decision in the
series. This may mean the elimination of possible fields, or
the consideration of five or six possible fields, or the elimination
of all possibilities except two or three, or the validation of a
tentative choice which has already been made. The same test
battery may be useful at different times for different steps in
the series of decisions. It is of importance to the guidance
officer that he recognize the position of the immediate problem
in this decision-series.

It is inevitable in so complex a problem that difficulties will
arise. These involve the personality of the student as well as
the complexities of modern economic life and the uncertainties
of future home life for college women. There are other difficulties
which result from the imperfections of the testing
instruments themselves.

One of the problems not infrequently met involves the emotional
immaturity of some college women. Some students
hesitate to make decisions. They prefer indecision and may
show resistance whenever progress threatens. There are other
tentative decisions which reflect a similar state of mind. The
student may insist upon a dream-vocation, and such a plan may
protect the student against the need for serious planning. Some
of these dream-vocations are practical enough for a few, but for
many students they correspond to the boy's plan to be a policeman.
They are signs of immaturity. The need of the student
whose problem is in this area is growth; time will almost always
be required. The wisdom of the guidance officer is challenged
as he attempts to stimulate growth toward maturity through
the vocational conferences.

Other students are vocationally immature. They simply
do not know enough about the possible vocations to make an
intelligent choice. They frequently have not considered the
matter systematically. It often happens among privileged
college women that they have not had the work experiences
which stimulate thought about future vocations. This situation
may be discovered in the conference. The tests themselves
may show consistently low levels of vocationally significant
interests. This may be a function of the testing procedures,
or it may reflect the vocational immaturity of the
student.

A student may request the vocational tests when her
immediate problems are educational rather than vocational. The
student with a serious reading handicap, for example, will be
recalled to the primacy of the educational problem. [...]
may be taken in the vocational decision-series, [...]
may require primary emphasis.

When the vocational conference ends with a vocational plan,
there are a few follow-up procedures which need to be attempted.
The educational program which takes the plan into account may
be [...], and [...] or professional training may [...].
When the outcome of the conference is less definite, the follow-up
procedures represent the next [...]. A bibliography for
further study, opportunity for interviews with active
workers in the field or with teachers in related fields, try-out
experiences, discussions with other students, and observation
of workers in the fields of interest are all resources which
may be employed. It is usually wise to make a specific
arrangement for a future conference at the time these
recommendations for follow-up are made. This stimulates active
attention on the part of the student.


The National College Sophomore Testing Program 


The College Sophomore tests are administered to all sophomores
during the period set by the national committee. The
Cooperative General Culture Test is one of the most useful
instruments developed for college students. The Contemporary
Affairs Test is valuable both in the measurement of achievement
in the area of contemporary life and in the suggestions
which it offers in the field of interests. The Cooperative English
Test adds data of primary importance in education. Since
these tests must come rather late in the sophomore year, they
are not available for many individual conferences until the
junior year. Profiles of the results are mailed during the
summer to all students who make the request.

One practical difficulty makes it hard to develop a happy
campus tradition concerning the sophomore tests. Most of
the sub-tests in the General Culture and the Contemporary
Affairs tests are over-timed for many students. The ablest
students tend to finish the tests and the least able tend to complete
all that they can do before the time limit has expired.
This places the administrator in a dilemma: if he permits
these students to move about or turn to other matters, he
encourages noise and stimulates hasty work by some; if he
enforces the instructions of the test, he appears arbitrary. The
present situation is better than the former practice which permitted
the extra time to accumulate at the close of the period.
Yet, it is still a handicap to the development of morale in the
testing situation.
The sophomore battery is, therefore, in some respects less
useful than the freshman tests or the vocational battery. It
lends itself well neither to the method of test reports nor to


amination procedures of the campus, ordering the tests, offering
to administer them, and frequently furnishing clerical help in
scoring. When the department has finished with the test
blanks, they are filed in the personnel folder of the individual
student.


Conclusion 


These testing procedures are phases of the educational program
and the personnel work of the institution; they have no
independent value. They enable the College to assemble valuable
data about applicants and about the students. Through
the test reports they stimulate students to consider their college
work systematically, and they are of value in the motivation
of these young people. They offer objective data for the study
of the academic work of the campus, supplementing personal
observation and impressions with important information.
Other data are also essential for such work, but the tests are
indispensable as technical resources.

The primary objective is educational; the vocational is
clearly secondary. Students come with sufficiently intense
interest in the vocational, motivated in this direction by their
homes and often by the guidance programs in the secondary
schools. The emphasis which we find most needful is on the
educational objectives even in contrast with the vocational.
The vision of the educated person, the competent citizen, must
be held before many of these young women as the major
expectation of the college years. This is made less difficult by the
correspondence of professional preparation in undergraduate
years with the program of liberal education.


THE COUNSELOR AND THE HIGH SCHOOL 
TESTING PROGRAM 


H. C. SEYMOUR 
Board of Education, Rochester, New York 


Introduction 


To understand the testing and counseling program of a
school system it is necessary to know its setting. Seven of the
nine high schools in Rochester, New York, are five-year schools
beginning with grade eight. One is a six-year high school, the
other a four-year technical and industrial high school for boys.
These nine schools serve approximately 14,000 boys and girls.
The total guidance staff consists of six full-time counselors,
eight part-time counselors, eight girls' advisers, nine boys'
advisers and five psychologists.

The Superintendent of Schools has delegated to a test committee
responsibility for policies concerning the standardized
testing program. The committee consists of the following: the
Director of Psychological Services, chairman, the Specialist in
Tests and Research, the Co-ordinator of Guidance Services, the
Co-ordinator of Elementary Education, the Director of Elementary
Education, one Secondary School Principal, and one
Elementary School Principal.

The committee's responsibilities include:
1. To arrange for regular city-wide testing surveys.
2. To assist in the selection of tests.
3. To determine the dates and time when examinations
   are to be given.
4. To assist school personnel to interpret the results.
5. To review and pass upon requests for special test
   surveys.
6. To inform school personnel of developments in testing
   and of improved methods of test interpretation.


The Standardized Testing Program 


The testing program in Rochester is set up on the assumption
that test results at any grade level contribute to the counselor's
understanding of the development of each pupil. Therefore
test results for pupils of elementary school age are as
valuable for secondary school counselors as are those obtained
after these pupils have entered upon their high-school
program. Often these earlier test results help to explain uneven
achievement, failure to achieve or the presence of unusual
abilities. In many instances they serve as clues to pupil
interests.

Tests of General Mental Ability 


All tests of general mental ability are given by the clinical
psychologists. The results are scored either by machine or by
clerks temporarily assigned to the Specialist in Tests and Research.
The latter is responsible for performing the usual
statistical operations and for the reports which are typed and
returned to each school. All test results are recorded on the
pupil's cumulative record for use by members of the counseling
staff.

Group tests of general mental ability are given in the [...]
of the year to every pupil in grades 3, 5, 7, and 10. The results
are made available to counselors and psychologists who use the
data for educational planning in the elementary school and for
course planning and subject election in the high school. It has
been the policy in Rochester to repeat the same test at the same
grade each year. The longer the same test is used, provided it
continues to meet the specifications set up by the test committee,
the more familiar are school personnel with its [...]
and limitations. At present the following tests are used:
    Grade 3—Kuhlman-Anderson Intelligence Tests
    Grade 5—Kuhlman-Anderson Intelligence Tests
        (tentative)
    Grade 7—Pintner General Ability Test—Intermediate
    Grade 10—American Council on Education Psychological
        Examination—High School Form
In addition, each fall prior to the opening of school,
pupils who transfer from private or parochial schools into grade
9 are asked to report for preliminary examination. At this
time the Otis Self-Administering Test of Mental Abilities is
given. The results are used to help classify these transferees
so that they may be absorbed smoothly into the school's
educational program.

Seniors who have been specializing in college preparatory 
subjects and who plan to enter a college or university are given 
the American Council on Education Psychological Examina- 
tion—College Form. Rochester has found these test results of 
value when recommending pupils for college scholarships or to 
specific educational institutions.

Approximately 2,000 to 2,500 individual mental ability
tests are given in the Rochester school system each year. Almost
without exception pupils who demonstrate unusual behavior
symptoms or who are maladjusted seriously in their
educational program are referred to the psychologist in the
school, the only member of the personnel staff authorized to
administer the Stanford Revision of the Binet Scale or the
Wechsler Adolescent Scale. The administration of the Binet
is requested frequently when group test results for the same
individual differ widely. In each case the psychologist writes
a summary report containing the findings and her recommendations.
This report becomes a part of the cumulative record
and is available to counselors when needed.


Standardized Achievement Tests 


Standardized tests for subject matter content and skills are
administered by teachers under the general supervision of the
psychologists. The latter are responsible for developing a
suitable training program for those who give these tests so that
the procedure is uniform throughout the city.
A reading achievement test is given to every pupil when he
completes the second grade. Towards the end of the fourth
and sixth grades the Iowa Every-Pupil Test of Basic Skills is
administered. This test provides scores in reading comprehension,
vocabulary, work study skills, language usage and arithmetic.
Each teacher records the results on each pupil's profile,
which becomes a part of his cumulative record.


During the seventh grade the reading section of the Iowa
Every-Pupil Test of Basic Skills is repeated. The results,
together with the scores on the Pintner General Ability Test, are
forwarded to the counselors in the high schools, who use this
data for classification and other guidance purposes.

Shortly after the pupil enters high school a diagnostic 
arithmetic test is given him. In the past the Schorling-Potter- 
Clark Arithmetic Test has been used, but a new and locally 
more appropriate test is now being devised by the high school 
mathematics council. A similar type of test in English for 
pupils beginning grade 9 will be selected in time to use in the 
fall of 1946. The results of these tests will be used by teachers 
to determine whether a remedial program is necessary and, if so,
what skills should be stressed.

It might be well to point out here that the results of the
New York State Regents examinations, although not standardized,
are available to the counselors in high schools to help
determine pupil interests and pupil strengths. Those who do not
take the Regents examinations are given comparable tests set
up locally by the Specialist in Tests and Research with the
assistance of teacher committees.


Special Testing Programs
Testing for Musical Aptitude.—The Seashore Psychological
Measure in Musical Talent, a test administered and evaluated
by a full-time professional psychologist, was installed by the
Rochester Board of Education [...]ments on a trial and [...]
turnover, [...]. Nevertheless, the
Seashore test gives important information which results in
intelligent guidance in music, and it has become the practice to
have all the children tested in the fifth grade, when possible.
Pupils who are absent at this time or who later transfer into the
system are given opportunity to take the test at stated intervals.
The fore-knowledge of the child's aptitudes and weaknesses,
if any, gives a valuable vantage point for guidance of the
child in all branches of music, and saves students and teachers
from embarrassing experiences.

Testing for Admittance to the Technical and Industrial 
High School.—The demand for training at this school is so 
great that a special testing program has been introduced to help 
select suitable applicants. After five years of experimental ac- 
tivity with a number of tests, Rochester now uses the following 
battery: 

    The American Council on Education—Psychological
        Examination—High School Form
    The Bennett Test of Mechanical Comprehension—Form A
    The Arithmetic Fundamentals Test—Form A*
All pupils who have completed the eighth grade are eligible to
take this battery of tests. The results are studied by the counselor
and psychologist at this school along with other data from
the pupil's cumulative record. The test results help to determine
whether applicants give evidence of aptitude for industrial
or technical education.

Classification Testing at the Paul Revere Trade School.—Rochester
supports a junior trade school for boys fourteen years
of age and over who have difficulty in progressing normally in
the more academic high school. A very careful analysis is
made of each applicant’s qualifications. The Stanford Achievement
Test, Intermediate Partial Form, and the California Test
of Mental Maturity (Short Form) are administered at the
time the pupil enters, to determine the grade level each has
attained in reading, arithmetic and general mental ability. The
results are used to classify pupils into groups and to assist
teachers to plan appropriate class work.


¹ A test devised locally by the Specialist in Tests and Research.

The Exploratory Program.—Several high schools in the
city have been experimenting with an exploratory program
for pupils in the eighth grade to assist them to learn from first
hand experience what subject areas they should elect in high 
school. The counselors in these schools have been experiment- 
ing with the Kuder Preference Record and the Bennett Test of 
Mechanical Comprehension to supplement the exploratory ex- 
periences. They report that the tests are valuable mainly to 
motivate pupils to think about their high school program in 
relation to their own interests and abilities. 

The Testing Program of the Clinical Psychologists. —The 
Rochester counseling program emphasizes the individual inter- 
view. Every pupil is seen individually at some time during the 
year by a member of the personnel staff. A percentage of them 
are referred to the school psychologist for intensive study. The 
type and extent of tests given in each case depends upon the 
nature of the problem. During the past year the following 
tests have been used in addition to the more common educational
achievement examinations: the Kuder Preference Record,
the California Test of Personality,² the Bell Adjustment
Inventory, the Rorschach Test, the Lewerenz Art Aptitude 
Test, the Keystone Telebinocular Test, and the Thematic 
Apperception Test. 

The Experimental Testing Program.—Rochester has been 
somewhat conservative in accepting standardized tests without
experimenting with them. During the last five years a
number of tests have been given to determine their validity and 
appropriateness in the local setting. These tests have not been 
given as a part of the regular testing program and the results 
have not been recorded on the pupil cumulative record cards. 
Tests recently under consideration include: the Chicago Tests
of Primary Mental Abilities, the Turse Shorthand Aptitude
Test, the American Council on Education Psychological Examination
(Short Form), the California Tests of Mental Maturity,
the Purdue Peg Board, the Wrenn Study-Habits
Questionnaire, and the Minnesota Multiphasic Personality
Inventory.


² In Rochester personality tests are not considered appropriate for use in groups.
However, they are very often used by psychologists in individual cases.


Conclusions 


Following are the chief assets of the Rochester testing
program:

1. The testing program is well controlled, well administered
   and appropriately timed for effective counseling.

2. There is ample corroborative evidence of general
   mental ability.

3. There are more individual test results than is true of
   the average testing program.

4. The musical aptitude testing program is exceptional.

5. The results are recorded faithfully upon the pupil
   cumulative record cards.

6. There is no evidence of testing as an end in itself.
   Each test has been included to help reveal changes
   in growth.


No testing program is without its limitations. The following
improvements should be given careful consideration:

1. The experimental program in vocational abilities
   needs to be accelerated so that a definite city-wide
   program can be recommended for grades 8, 9 and 12.

2. The achievement testing program should be strengthened
   in grades 10, 11 and 12.

3. An art aptitude program is needed.

4. The results of tests should be translated into standard
   scores to assist counselors and psychologists to
   make valid comparisons between tests differently
   standardized.
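
The recommendation to translate results into standard scores can be illustrated with a short sketch. The article does not say which standard-score scale Rochester had in mind; the sketch below assumes the common z-score (mean 0, standard deviation 1) and a T-score rescaling (mean 50, standard deviation 10), and the raw scores are invented for illustration only.

```python
from statistics import mean, pstdev


def standard_scores(raw_scores):
    """Convert raw scores on one test to z-scores (mean 0, SD 1)."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # population SD of the norm group
    if sd == 0:
        raise ValueError("all scores are identical; standard scores are undefined")
    return [(x - m) / sd for x in raw_scores]


def t_scores(raw_scores):
    """Rescale z-scores to a mean of 50 and an SD of 10 (an assumed scale)."""
    return [50 + 10 * z for z in standard_scores(raw_scores)]


# Hypothetical raw scores for the same five pupils on two differently
# standardized tests. The raw numbers are not comparable across tests.
arithmetic_raw = [12, 15, 18, 21, 24]
reading_raw = [40, 50, 60, 70, 80]

# After conversion the two tests share one scale, so a pupil's standing
# on one can be set beside his standing on the other.
for a, r in zip(t_scores(arithmetic_raw), t_scores(reading_raw)):
    print(round(a, 1), round(r, 1))
```

In this invented case each pupil holds the same relative position on both tests, so the two columns agree even though the raw scores differ widely.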


THE SELF-APPRAISAL PROGRAM IN THE
PHILADELPHIA JUNIOR HIGH SCHOOLS


MARGARET H. WILSON 
Philadelphia Board of Public Education 


In the year 1939-40 a group of Philadelphia secondary
school principals, teachers, and counselors met with the staff
of the Division of Educational Research to discuss the guidance
activities in their various schools. A survey made by this
group indicated that, in spite of the efforts of interested teachers
and counselors, many pupils were expressing career choices
and, in many cases, selecting high-school curriculums in which
they would probably not find success. It was evident that they
knew little about themselves, about the opportunities for
preparation offered in the city high or vocational schools, or
about the possibilities for employment in their chosen careers
after these courses had been pursued.

Choices of school and curriculum for the higher schools are
made in Philadelphia in the junior high-school ninth grade. A
change in the homeroom guidance activities for both semesters
of this final year in the junior high-school was proposed by the
Division of Educational Research in the form of “The Ninth
Grade Guidance Project,” designed to improve and extend the
data relative to aptitudes, interests, and social adjustment already
available to individual pupils. Principals and teachers
volunteered to attempt a different approach to guidance
through the use of self-appraisal. In September 1941 the project
was begun in eight of the twenty-five junior high-schools
and carried on throughout the school year 1941-42. Close contact
between the schools and the Division of Educational Research
was maintained. Materials were prepared, meetings
were held, and teachers were assisted in their use of measurement
techniques. At the close of the year, the evaluations made
by the principals and teachers who participated led to the development
of the present Self-Appraisal Program for September
1942.

The Self-Appraisal Program is a guidance project to be en- 
tered into cooperatively by teachers and their pupils. For the 
teacher the program provides opportunities in the use of ob- 
jective techniques for learning about the individual pupil. For 
the pupil the program has three major purposes: (1) the dis- 
covery and examination of his abilities, interests, and social- 
adjustment needs; (2) a study of the world at work with a 
view to a choice of a career area likely to be well-suited to him 
as an individual; (3) the wise choice of school, appropriate 
subject courses, and activities for the tenth grade. 

The launching of the Self-Appraisal Program requires several
class periods. The pupils receive a brief explanation of the
purpose and nature of the program. The measuring instru- 
ments are briefly described as are the charting of the results, 
the plan for study of occupations, the conferences, and the 
choices to be made at the close of the project. At the beginning 
of the program each pupil answers a questionnaire which sup- 
plies a backlog of information for the teacher's use in under- 
standing the pupil’s problems. The questionnaire includes 
items about the home and family, in- and out-of-school ac- 
tivities, educational and vocational plans. The pupil also 
states his career choice and curriculum plans in the form of a
brief career prophecy. 
After the pupil has obtained an appreciation of the purposes 
of the Self-Appraisal Program and feels some enthusiasm for 
applying its techniques to himself, he is ready to begin a study 
of his own traits. The pupil is probably aware of some traits
such as his general health, his club interests, his ability in leadership,
his skill in getting along with others, and his success in
certain school subjects. This information, however, is likely 
to be scattered and very general. 

There are available for all Philadelphia junior high-school
pupils scores in a number of tests administered on a city-wide
basis. These are as follows: in grade 7B, Philadelphia Problems
in Arithmetic and either the Intermediate Progressive Reading
Test or the Stanford Advanced Language Arts Tests 1 and 2
(Reading); in grade 8A, the Philadelphia Diagnostic Test in
Fundamentals of Arithmetic; in grade 8B, the Philadelphia
Junior Test in English Usage; in grade 9A, the Philadelphia
Verbal Ability Test and the Revised Minnesota Paper Form
Board Test.

In addition to these basic tests, three special measuring instruments
are used in the Self-Appraisal Program to provide
objective clues for self-study. The scores from the six Chicago
Tests of Primary Mental Abilities show a range of aptitudes.
The Kuder Preference Record is used to determine the degree
of interest or preference expressed by the pupil in nine fields related
to occupational areas. The Washburne Social-Adjustment
Inventory helps to supply clues concerning the pupil’s
social and emotional adjustment at home and at school. These
tests are administered by the teacher during the homeroom
guidance or social-living core period. With the exception of the
word-fluency sub-test of the Chicago battery, all tests are
machine-scored in the Division of Educational Research.

The Chicago Tests of Primary Mental Abilities are used as
measures of six separate aptitudes: number, verbal meaning,
spatial thinking, word fluency, reasoning, and memory. The
directions for administration given by the authors are clear and
easily followed by teachers, many of whom are not experienced
in test administration. Each of the six tests is so organized
that it can be administered as a unit within a forty-five minute
period or broken into its component parts for administration
during shorter periods of time. The tests are interesting and
enjoyed by the pupils.

After administering the Chicago Tests of Primary Mental
Abilities to his pupils, the teacher develops with them more appreciation
of the significance of aptitudes in career planning,
discusses the use of a profile to show aptitudes, and presents
in some detail the aptitude section of several sample charts
which appear in the Handbook for Teachers. The idea of peak
scores is developed. The pupils discuss the areas of high aptitude
shown on the sample charts, the career plans, and the
school and curriculum decisions of the boys and girls whose
charts are presented. As a further step in understanding the
meaning of test scores in their relationship to career choice, 
the teacher and pupils develop lists of occupations that obvi- 
ously require particular aptitudes. This completely impersonal 
method of approach to interpretation is used to prepare the 
pupil for understanding more adequately his own test scores. 

It is obvious to the pupils that additional occupational information
is necessary before a final career choice is made. The
measurement program is usually broken into at this point to
provide time for this study. If the pupil is just beginning to
think about the working world with several years of school in
prospect, a general view of the whole range of occupations is
needed in order to help him select fields for exploration and 
further study. If he is at the point of deciding just what vo- 
cational preparation he should make before leaving school, an 
attempt is made to center his attention on quite specific infor- 
mation concerning the few occupations that keenly interest 
him. 

Pupils gather information on jobs from books, pamphlets, 
and technical magazines in the library and from newspapers 
and magazines in their homes. Actual visits to plants, places 
of business, and professional offices afford first-hand contacts 
with work actually being done. Pupils act as reporters to
gather information from workers. Class discussions and oral 
and written reports are encouraged. Several schools have 
career forums during which vocational films are shown and 
discussed. In several classrooms teachers have made effective 
use of visual aids in the form of sound or silent films, glass or 
film slides as graphic means of presenting facts about occupations.
Speakers are invited to talk on careers which interest
pupils. Bulletin board displays emphasize graphically occupations
of particular interest to groups of pupils or those deserving
special local attention.


The emphasis in this study of occupations is on pupil-initiative
and pupil-activity. Teachers agree that the more the
pupils do toward discovering for themselves occupations which
are new to them and the more they learn about the kind of work
in which they have particular interest, the better prepared are
they when it comes to making their own career and curriculum
choices.

After he has begun his study of occupations, the pupil resumes
his program of self-appraisal, this time with emphasis
on interests. Possibly he knows which type of activity he would
enjoy as a career. He may have difficulty, however, because
of a variety of interests, in deciding just which types of work
are most likely to bring him satisfaction. It is possible for him
to coordinate his vague ideas into a pattern of interests through
the use of an interest inventory such as the Kuder Preference
Record. This measuring instrument highlights interests in nine
areas which are closely related to occupational choice: mechanical,
computational, scientific, persuasive, artistic, literary,
musical, social service, and clerical. The pupil indicates his
preferences by marking which activities he likes most and which
least. The resulting scores emphasize for him levels of interest
in each of the categories checked by the Record.

By way of preparing the individual pupil for understanding
the meaning of his own scores, the teacher presents to the group
the interest section of the sample profile charts referred to previously.
The pupil checks the peak scores in interests as well
as the career and curriculum decisions of these pupils whose
charts are being reviewed. He creates for himself lists of occupations
involving high interest in each of these nine areas.

After this preliminary group interpretation of the meaning
of interest scores, each pupil notes the peak interests identified
for him. He refers to his list of occupations, to tables supplied
by the author of the Record for interpreting scores, and to the
reverse of the self-appraisal profile chart to lists of workers with
similar high interests. He considers which types of work involve
the combination of high aptitudes and interests which
his test scores show him to possess. He reviews his career
choice and plans in the light of the information he now possesses
about himself and possibly makes adjustments in his plans.

The Washburne Social-Adjustment Inventory is used by the
pupil with the understanding that the results are completely
confidential and will be discussed in an individual conference.
Because of the highly personal nature of the responses, a procedure
very different from that used with other measuring instruments
is followed for the Inventory. The scores are not
recorded on the self-appraisal profile chart by the pupil. Instead
they are totaled and entered on the Inventory profile by
the teacher who then decides which pupils to interview and in
what order.

The Inventory provides scores in happiness, alienation, 
sympathy, purpose, impulse-judgment, and control. In addi- 
tion, a means is provided for checking the frankness with which 
the pupil has responded. The scores reveal levels of adjustment 
from excellent to maladjusted and are used by the teacher as 
clues for discovering and supplementing his knowledge of which 
pupils are in need of help in meeting their problems. The 
teacher makes an effort to interview each of the pupils whose 
scores in one or more areas fall into the maladjusted level. It 
is often true that, because of the very nature of the information 
revealed, the teacher does not use the profile directly in his 
conference with a particular pupil but attempts to get at his 
difficulties less directly. If the interview shows a need for more 
intensive counseling than the teacher has time and facilities to 
accomplish, the pupil is referred to one of the school counselors. 
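
The screening step described here, in which pupils with one or more area scores at the maladjusted level are singled out for interview, can be sketched as follows. This is only an illustration: the article leaves the choice and order of interviews to the teacher's judgment, so the ordering rule below (most maladjusted areas first) is an assumption, as are the pupil names and the intermediate level label "average".

```python
# The six areas scored by the Inventory, as listed in the text.
AREAS = ["happiness", "alienation", "sympathy", "purpose",
         "impulse-judgment", "control"]
MALADJUSTED = "maladjusted"


def pupils_to_interview(profiles):
    """Return pupils with at least one maladjusted area.

    Ordering is an assumption made for this sketch: pupils with more
    maladjusted areas come first, ties are broken alphabetically.
    """
    flagged = []
    for name, levels in profiles.items():
        count = sum(1 for area in AREAS if levels.get(area) == MALADJUSTED)
        if count >= 1:
            flagged.append((count, name))
    flagged.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in flagged]


# Hypothetical profiles; "average" is an invented intermediate label.
sample_profiles = {
    "Pupil A": {area: "excellent" for area in AREAS},
    "Pupil B": {**{area: "average" for area in AREAS},
                "happiness": MALADJUSTED},
    "Pupil C": {**{area: "average" for area in AREAS},
                "alienation": MALADJUSTED, "control": MALADJUSTED},
}

print(pupils_to_interview(sample_profiles))
```

A list of this kind would serve only as a starting point, since the text makes clear that the profile supplies clues that supplement the teacher's own knowledge of his pupils.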

The self-appraisal profile chart is a graphic means of re- 
cording the results of the measurement section of the program. 
Each pupil makes out his own chart, adding to it from time to 
time as additional scores are available. All entries are checked 
for accuracy. The top of the chart identifies the pupil by name,
school, date of first entry to the program, residence, and name 
of his homeroom adviser. Space is also provided for recording 
two career choices, the first made on entry to the program and 
the second as the pupil makes his curriculum selection for 
senior high or vocational school at the close of grade 9B. 

The chart itself provides a cross-section of aptitudes and interests
of the pupil at the time of testing. The aptitudes include
the results of the six Chicago Tests of Primary Mental
Abilities and the scores of tests administered on a city-wide
basis in successive terms in the junior high school. The interests
are those explored in the nine areas of the Kuder Preference
Record. A completed profile shows six scores for the
Chicago tests, the six Philadelphia test scores, and nine interest
scores, a total of twenty-one measures.
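
The make-up of a completed profile can be laid out as a simple checklist. The sketch below is illustrative only: the measure names are taken from the lists given in the text, but the labels, the function name, and the sample entries are assumptions made for the example.

```python
# Six Chicago Tests of Primary Mental Abilities.
CHICAGO_TESTS = [
    "Chicago: number", "Chicago: verbal meaning", "Chicago: spatial thinking",
    "Chicago: word fluency", "Chicago: reasoning", "Chicago: memory",
]

# Six scores from the city-wide tests, grades 7B to 9A.
CITY_WIDE_TESTS = [
    "7B Philadelphia Problems in Arithmetic",
    "7B reading test (Progressive or Stanford)",
    "8A Philadelphia Diagnostic Test in Fundamentals of Arithmetic",
    "8B Philadelphia Junior Test in English Usage",
    "9A Philadelphia Verbal Ability Test",
    "9A Revised Minnesota Paper Form Board Test",
]

# Nine interest areas of the Kuder Preference Record.
KUDER_AREAS = [
    "Kuder: mechanical", "Kuder: computational", "Kuder: scientific",
    "Kuder: persuasive", "Kuder: artistic", "Kuder: literary",
    "Kuder: musical", "Kuder: social service", "Kuder: clerical",
]

ALL_MEASURES = CHICAGO_TESTS + CITY_WIDE_TESTS + KUDER_AREAS


def missing_measures(recorded):
    """Return the measures not yet entered on a chart, in chart order."""
    return [name for name in ALL_MEASURES if name not in recorded]


# Hypothetical partially completed chart (scores on the five-level scale).
partial_chart = {"Chicago: number": 4, "Kuder: computational": 5}

print(len(ALL_MEASURES))                      # 6 + 6 + 9 measures
print(len(missing_measures(partial_chart)))   # entries still to be made
```

Because each pupil adds to his chart from time to time as scores become available, a check of this kind mirrors the accuracy check the text says is made on all entries.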

The results from all tests administered on a city-wide basis
in the secondary schools of Philadelphia are expressed in terms
of relative scores which are derived from the distribution of
scores of a standard secondary school population. A relative
score of 1 (very low) extends from zero to the seventh percentile;
a score of 2 (low), from the eighth to the thirty-first
percentile; a score of 3 (average), from the thirty-second to the
sixty-ninth percentile; a score of 4 (high), from the seventieth
to the ninety-third percentile; a score of 5 (very high), from
the ninety-fourth to the ninety-ninth percentile. The five
levels into which the self-appraisal profile chart is divided
(very low, low, average, high, very high) correspond to the
relative score levels 1 to 5.
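
The percentile bands above amount to a small lookup rule, which might be sketched as follows. The band limits come directly from the text; the function and variable names are assumptions for illustration, and the sketch does not show how percentiles themselves were derived from the standard population.

```python
import bisect

# First percentile of relative scores 2, 3, 4 and 5, as given in the text.
LOWER_BOUNDS = [8, 32, 70, 94]

LABELS = {1: "very low", 2: "low", 3: "average", 4: "high", 5: "very high"}


def relative_score(percentile):
    """Map a percentile rank (0 to 99) to a relative score of 1 to 5."""
    if not 0 <= percentile <= 99:
        raise ValueError("percentile must lie between 0 and 99")
    # bisect_right counts how many band boundaries the percentile has reached.
    return bisect.bisect_right(LOWER_BOUNDS, percentile) + 1


for p in (0, 7, 8, 31, 32, 69, 70, 93, 94, 99):
    score = relative_score(p)
    print(p, score, LABELS[score])
```

The bands are widest in the middle: the average level covers thirty-eight percentile points, the low and high levels twenty-four each, and the two extreme levels eight and six.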

Instructions to the pupil for preparing and using the profile
chart are provided on the reverse of the chart. The pupil is
directed to a table of information concerning the meaning of
scores and how this information may be used in understanding
his profile. On this table the aptitude tests are described in
groups so that the pupil can readily locate scores for related
tests; i.e., for number aptitude, the pupil is referred to the
Chicago Number Test and to the 8A Philadelphia Diagnostic
Test in Fundamentals of Arithmetic; for aptitude in verbal
meaning, to the Chicago Verbal Meaning and Word Fluency
Tests, the 7B Stanford or Progressive Reading Test, the 8B
Philadelphia Junior English Usage Test, and the 9A Philadelphia
Verbal Ability Test; for aptitude in spatial concepts, to
the Chicago Spatial Thinking Test and the Minnesota Paper
Form Board Test; for reasoning aptitude, to the Chicago
Reasoning Test and the 7B Philadelphia Test in Problems in
Arithmetic.
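
The grouping of related tests on the reverse of the chart can be pictured as a lookup table. The sketch below follows the four groups named in the text; the dictionary layout, the function name, and the sample scores are assumptions made for illustration.

```python
# Aptitude groups and their related tests, as listed in the text.
APTITUDE_GROUPS = {
    "number": [
        "Chicago Number Test",
        "8A Philadelphia Diagnostic Test in Fundamentals of Arithmetic",
    ],
    "verbal meaning": [
        "Chicago Verbal Meaning Test",
        "Chicago Word Fluency Test",
        "7B Stanford or Progressive Reading Test",
        "8B Philadelphia Junior English Usage Test",
        "9A Philadelphia Verbal Ability Test",
    ],
    "spatial concepts": [
        "Chicago Spatial Thinking Test",
        "Minnesota Paper Form Board Test",
    ],
    "reasoning": [
        "Chicago Reasoning Test",
        "7B Philadelphia Test in Problems in Arithmetic",
    ],
}


def scores_by_aptitude(scores):
    """Collect a pupil's recorded scores under each aptitude group."""
    return {
        aptitude: {test: scores[test] for test in tests if test in scores}
        for aptitude, tests in APTITUDE_GROUPS.items()
    }


# Hypothetical relative scores (1 = very low ... 5 = very high).
sample_scores = {
    "Chicago Spatial Thinking Test": 5,
    "Minnesota Paper Form Board Test": 5,
    "Chicago Number Test": 4,
}

for aptitude, found in scores_by_aptitude(sample_scores).items():
    print(aptitude, found)
```

Setting related scores side by side in this way lets agreement or disagreement between two measures of the same aptitude be seen at a glance.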

The information contained on the profile chart is valuable
not only for the pupil as he plans his career, but also for reference
by those who advise with him as he progresses. While the
pupil is still in the junior high school, the profile chart is used
by the principal, counselor, and teachers as a graphic record of
the pupil’s aptitudes and interests. At the close of the pupil’s
junior high school experience, the profile chart is enclosed with
other records which are transferred to the vocational or senior 
high school. The profile chart is available for reference in the 
senior high schools by the principal, counselor, schedule makers, 
teachers, or by the pupil who may wish to review his aptitudes 
or interests. 

The self-appraisal profile chart which follows is an actual 
record of a pupil’s aptitudes and interests. It should be under- 
stood that in presenting this chart the author is not attempting 
to indicate an ideal combination of aptitudes and interests for 
a particular occupational area, but rather is illustrating the 
problem of choice which faces a pupil who possesses certain 
traits. 

A study of John’s profile chart reveals considerable aptitude
in spatial concepts as evidenced from his very high scores in
both the Chicago Spatial Thinking Test and the Minnesota
Paper Form Board Test. In number aptitude John’s score in
the Chicago Number Test is high, but his score in the Philadelphia
Arithmetic Fundamentals is only average. His school
record shows a little better than average achievement in mathematics.
John’s verbal scores in the Intermediate Progressive
Reading Test and the Philadelphia Verbal Test are high; in the
Chicago Verbal Meaning Test and the Philadelphia Junior
English Usage Tests, average. His scores in the Chicago
Reasoning Test and Philadelphia Problems in Arithmetic are
both average. If these reasoning scores are correct, John shows
only average ability in working out problems. In the Chicago
Word Fluency Test and Memory Test his scores are low.


According to the scores on the Kuder Preference Record,
John’s computational interest is very high; his mechanical and
scientific interests, high; his artistic, literary, and social service
interests, average; and his persuasive, musical, and clerical
interests, low.


On entering the Self-Appraisal Program, John indicated
mechanical engineering as a career choice. At the same
time that he was studying his aptitudes and interests, he was
gaining skill in mechanical drawing. When it came time at the
close of the term for him to express a second career plan, he
had decided upon drafting. John may make a number of
changes in his life planning as he progresses through the senior
high school.

[Profile chart, Self-Appraisal Program of Guidance in the Junior High School. School: Western Junior High School. Pupil’s name: John. Date of first entry: 8A, Sept. 1943. Residence: 162 N. Broad Street. Adviser: Miss Crawford. Career plans: 1, Mechanical Engineer; 2, Draftsman. Tenth grade selections: school, Parkway High; curriculum, Mechanic Arts. The chart plots the aptitude scores and the Kuder Preference Record interest scores on the five levels from very low to very high.]

John’s aptitude in spatial thinking forecasts possible success
in either of the careers that John has selected. His high interest
in mechanical, computational, and scientific activities suggests that
he would probably enjoy working in either area of his choice
or in a related occupation. If the high verbal meaning scores
are correct measures, John shows better than average aptitude
for understanding written material. If, however, the average
verbal meaning scores are correct indicators of his ability in
handling written material, John may have some difficulty in
doing the amount and quality of reading and writing required
for college engineering. If it is true that John has only average
ability in number and reasoning, there is some doubt of his
being successful in advanced mathematics and physics which
would be required in courses preparatory to engineering.

John and his parents considered the machine design and
construction and patternmaking curriculums in a vocational
school but decided that he should enroll in the mechanic arts
curriculum in a senior high school. This is an academic curriculum
with a sequence in shop and mechanical drawing. If
John is able to complete the mathematics and physics of the
mechanic arts curriculum successfully, he will be able to apply
for college entrance. In any event, with the shop and mechanical
drawing, as well as the other subjects of this curriculum,
John will have quite adequate preparation for entrance into
drafting, his second career choice.

The culmination of the program is the career-planning interview.
It is essential to have mutual understanding on the
part of home and school concerning the pupil’s abilities and
interests and the opportunities for their development afforded
in particular curriculums in the senior high or vocational
schools. Every effort is made to include the parent in this
conference which gives to parent, pupil, and teacher an opportunity for
discussion of the profile chart, the occupational
areas of greatest probable success, and the problems which require
the attention of home and school to the needs of the
pupil as he passes from one school level to another. Out of this
interview come the final decisions as to school and subjects for
the tenth grade.

The present Self-Appraisal Program has evolved as a result 
of continuous evaluation by those who have worked closely 
with it. In addition to frequent informal evaluations, oppor- 
tunities are afforded term by term for organized appraisals. 
Each 9B pupil, his adviser, the principal, and the staff member 
administering the program in the school is asked to express 
his opinion of the value of different phases of the program. 

The results of a follow-up study of the 1941-42 group of
pupils show that there has been an apparent reduction of one-third
in failure and one-half in dropout of the pupils who experience
this program as compared with their contemporaries in
the senior high schools. There is evidence from pupils’ statements
that during their self-appraisal they learn many things
about themselves that help them make better adjustment in
senior high or vocational school.

The use of the Self-Appraisal Program in his school is optional
on the part of each principal. The growing feeling that
all pupils, rather than a few, have a right to experiences in self-appraisal
has led many schools to extend the program to all
classes in a grade. The one year of appraisal activities as provided
for in the 1941 plan has been extended in most schools
to two years, including grades 8A to 9B. The result has been
a constantly expanding program which has grown from six
hundred twenty-four pupils working with sixteen teachers in
eight schools in 1941 to eighteen thousand, one hundred twenty-[illegible]
pupils and four hundred four teachers in twenty-five schools
in 1945.


In 1941 the single weekly guidance period was used for self-appraisal.
With the increased interest in guidance activities
on the part of teachers and pupils came the realization that a
larger portion of school time should be devoted to what seemed
to be an important part of the pupil’s school experience. At
the present time nearly all of the participating schools are
using two periods weekly for self-appraisal.

From a number of teachers have come requests for more
time in which to interview pupils and their parents and a
suitable place for these conferences. Many teachers are using
all of their preparation periods, are coming early and staying
late to crowd in conferences so that all pupils can have the
opportunity of talking over their career plans.

While most of the schools use the program as described,
some change it to suit their special needs. One school with a
large incoming 9A group for which no aptitude scores are
available has chosen to reverse the order of test administration,
using in the 8B grade section the Kuder Preference Record
and reserving until 9A the Chicago Tests of Primary Mental
Abilities. Two schools open their programs by the use of the
California Personality Test as a means of discovering for all
pupils self and group adjustment clues early in their junior high
school experience. In one school a school-work group is using
an adaptation of the program worked out by the [illegible],
the teacher, and the pupils. In two elementary schools in which
eighth grade classes are retained, pupils are beginning in 8A
the analysis of their aptitudes and in 8B are learning more
about the world at work. These pupils will continue their self-appraisal
in the junior high schools to which they will be
transferred for the ninth grade.

The Self-Appraisal Program has been instrumental in focusing
attention on a more thoughtful career and curriculum choice
for the senior high schools. Junior high-school principals,
counselors, teachers, pupils, and their parents have expressed
satisfaction at having available more objective data on which to
base these choices. The clues provided by the adjustment inventories
have helped many teachers to understand their
pupils’ problems. At the same time the pupil-parent-teacher
conferences have assisted many pupils in making a better
adjustment concerning their home and school problems.

It is the earnest desire of those who are working closely with
the Self-Appraisal Program that it remain flexible and vitally
alive, that it challenge attention to the need for extending time
allotted for guidance, that it continue to increase teacher understanding
of pupils’ problems, and finally through teacher-pupil-parent
conferences that it knit more closely the ties between
home and the school.




THE PRACTICAL ADAPTATION OF COUNSELING 
AND TESTING TO AN INDUSTRIAL SCHOOL 


JOHN O. HERSHEY 
The Hershey Industrial School, Hershey, Pennsylvania 


Counselors and others active in school personnel work
have at their disposal a vast amount of varying types of materials
for use in their respective programs. The true value of
these materials rests in their application to situations for which
they are suitable and in their practical adaptation to these
situations. The Hershey Industrial School has tried to keep
this principle in mind in its selection and use of guidance
materials.

Like all schools The Hershey Industrial School has its own
problems, some entirely unique and others merely unique in
part. The school located in Hershey, Pennsylvania, was
founded by the late Milton S. Hershey to provide a free educational
program which would prepare orphan boys for successful,
productive citizenship. To render this service for these
unfortunate youth an elaborate program was developed to
give hundreds of boys a home, vocational training, and assistance
in beginning the art of earning their own living after
leaving school. Recognizing individual differences and the
need for training in various occupational fields, the educational
program of the senior high school, grades 10, 11, 12, gives all
boys an opportunity to choose one of the following training
departments: academic or college preparatory, agriculture, auto
mechanics, baking and candy making, commercial, electricity,
machine shop, plumbing and heating, printing, sheet metal and
[illegible], and woodworking.
Such a program in the senior high school, therefore, necessitates
emphasis on vocational guidance in the junior high school
as a prerequisite to placement in one of these training
departments. The problem of giving adequate and reliable p
becomes especially important in view of the early age at which 
vocational choices must be made, and in view of the fact that a 
major misplacement would be costly to an orphan who may 
never have sufficient funds or an opportunity to change his 
vocation conveniently and secure proper training in a different 
field after leaving the Hershey School.

To provide the boys in the Junior High School with a foun- 
dation for choosing wisely a course of training, the guidance 
service extends a three-fold emphasis at this level—namely, 
vocational information, pupil evaluation, and counseling.

Vocational information is made available in an [...]
manner one period each week to the pupils of grades eight and
nine. "Occupations I" for grade eight deals with the [...]
world of work and the various types and levels of [...]
endeavor, while "Occupations II" in grade nine gives the pupil
an opportunity to narrow his study to any three specific
occupational areas. For example, one boy in grade nine may
study literature on (1) woodworking, (2) social service, and
(3) store clerking, while another boy at the same time may
study (1) electrical occupations, (2) sheet metal work, and
(3) engineering. The use of a mobile occupational library¹
in the class room makes possible a workshop with a personalized
approach to occupational study that permits each pupil to
prepare for his vocational choice in the light of his own interests
and aptitudes. Field trips, movies, and general library facilities
supplement this study approach.

Pupil evaluation calls for the appraisal of aptitudes,
interests, and various aspects of personality development and
social adjustment. Here again there is an effort on the part of
the school to make practical its selection and use of guidance
materials. In choosing test materials the counselor had to give
consideration to such factors as the age and sex of pupils to be
tested, the qualities to be measured or identified, the practical
aspects of administration, the interpretation and use of the
tests and their results, and the degree of reliability and validity


¹ "A Mobile Occupational Library." Occupations, The Vocational Guidance
Journal, XXIV (1945), 9[...].




of the tests available which would meet the recognized needs of 
evaluation. 


The group testing program now in operation for all of the
junior high pupils can be classified as follows:

(1) Mental Intelligence

Otis Quick-Scoring Mental Ability Tests, Beta A and
Beta B, by Arthur S. Otis.

The Chicago Tests of The Primary Mental Abilities,
by L. L. Thurstone and T. G. Thurstone. (Measures the
mental abilities of number, verbal meaning, space, word
fluency, reasoning, and memory.)

(2) Academic Achievement

Stanford Achievement Test, Advanced Battery (complete),
Form H, by T. L. Kelley, G. M. Ruch, and L. M.
Terman. (Tests paragraph meaning, word meaning,
language usage, arithmetic reasoning, arithmetic computation,
literature, social studies, elementary science, and spelling.)

(3) Mechanical Aptitude

Test of Mechanical Comprehension, Form AA, by G.
K. Bennett.

Revised Minnesota Paper Form Board Test, Series AA
and Series BB (a test of spatial relations), by R. Likert
and W. Quasha.

Industrial Training Classification Test, Form A, by C.
H. Lawshe and A. C. Moutoux.

(4) Commercial Aptitude

Detroit Clerical Aptitude Examination, by H. J.
Baker and P. H. Voelker. (Tests rate and quality of
handwriting, rate and accuracy in checking, knowledge of simple
arithmetic, motor speed and accuracy, knowledge of simple
commercial terms, visual imagery, rate and accuracy in
classification, and alphabetical filing.)

(5) Interests

Kuder Preference Record, Form BB, by G. Frederic
Kuder. (Identifies degree of interest in the following fields:
mechanical, computational, scientific, persuasive, artistic,
literary, musical, clerical, and social service.)




(6) Supplemental Testing 
In addition to the regular group testing program, a 
number of other tests are administered either to groups or 
to individuals when the presence of specific problems
warrants their use.

The real value of the testing program lies in the interpreta- 
tion of the test results and in their practical use for the student 
himself as well as for the teachers, administrators, or others 
who should know of the appraisals of a particular pupil. To 
make more meaningful the test results to the students and to 
all others concerned the Director of Guidance has developed 
a system whereby all test ratings or scores are transposed into 
ratings of A (representing the top 10 per cent), B (representing 
the next 20 per cent), C (representing the middle 40 per cent),
D (representing the next lower 20 per cent), and E (repre- 
senting the bottom 10 per cent). Figure I, a section of the 
cumulative guidance record, shows how these results are 
recorded for use in vocational counseling and for reference
throughout the senior high-school years and at the time of job
placement.
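
The five-letter rating scheme described above can be sketched in a few lines of Python. This is only an illustration, not the Director of Guidance's actual procedure: the function names, the handling of tied scores, and the treatment of a pupil who falls exactly on a boundary (placed in the higher band) are all assumptions, since the article gives only the percentage bands.

```python
from bisect import bisect_right

# Upper bounds (in percentile rank) of the E, D, C and B bands;
# anything at or above 90 falls in the A band. Boundary handling is assumed.
CUTOFFS = [10, 30, 70, 90]
LETTERS = "EDCBA"


def percentile_rank(score, reference_scores):
    """Per cent of the reference group scoring below `score` (ties count half)."""
    below = sum(1 for s in reference_scores if s < score)
    tied = sum(1 for s in reference_scores if s == score)
    return 100.0 * (below + 0.5 * tied) / len(reference_scores)


def letter_rating(pr):
    """Map a percentile rank (0-100) to a letter rating from A to E."""
    return LETTERS[bisect_right(CUTOFFS, pr)]


def rate_pupil(score, own_group, norm_group):
    """Return (rating against the pupil's own group, rating against the norms)."""
    return (
        letter_rating(percentile_rank(score, own_group)),
        letter_rating(percentile_rank(score, norm_group)),
    )
```

Calling `rate_pupil` with a pupil's raw score, the scores of his own grade, and a hypothetical norm sample yields the two ratings that the summary chart records side by side.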


[Figure I. Summary Chart of Test Results. Solid line represents standing
with standardized norms; dotted line represents grade rank.]


At the time each pupil receives his counseling interview dealing
directly with the interpretation of the test results he is given
a blank summary form similar to that of Figure I, accompanied
by a set of instructions and questions. The pupil then writes
in his own ratings as the counselor gives him the results on
the various tests. It should be noted that each pupil is rated
in relation to his own group, as well as with the standard norms.
This counseling interview aims to help the pupil to apply the
results of his test experiences in a practical and fruitful way.
He is now in a better position to choose wisely his areas of
occupational study as well as his final selection of vocational
training.

Another phase of student appraisal is the subjective evalu- 
ation of the personality and character development and social 
adjustment of each pupil by the teachers, house-parents, ad- 
ministrators, and others who have intimate contacts with them. 
The summary of these appraisals is then used in counseling
with the pupil regarding identified problems or tendencies
toward undesirable behavior patterns. These results also have
value in rating boys for scholarships, awards, job readiness, and
the like.

This discussion has dealt chiefly with the
practical aspects of the guidance program as it relates to
vocational choice and preparation. Other phases of the guidance
service of the school also have been developed, such as the
orientation program, remedial procedures for specific types of
problems, general problem counseling, placement, and
follow-up. The school is attempting to keep the entire guidance
program an evolving one that calls for the continual evaluation
and adaptation of new materials, techniques, and procedures
as a source of enrichment for progress in the quantity and
quality of its service.




USING TESTS IN A SMALL SCHOOL SYSTEM 


GEORGE SPACHE 
Horace Greeley School, Chappaqua, New York 


IN ATTEMPTING to describe the testing program of a school
system it soon becomes apparent that the uses, interpretation,
and even the very kinds of tests chosen are affected by the
school's philosophy and the quality of its faculty and
administration. One cannot merely enumerate the measures used
without explaining the reasons for their choice from among the
great mass of tests available. These reasons evolve from the
school's own concept of its role and responsibility to the
community. Thus the reader will find intruding upon the
description of the testing program some discussion of the school's
philosophy. If, as we conceive of it, testing is an integral part
of the school's functioning, then this discussion is justifiable.


Kindergarten 


Reading readiness tests of aptitude, which are in effect
measures of coordination, attention, verbal facility and
conceptual background, such as the Monroe*, or the Alice-Jerry,
and the Pintner-Cunningham intelligence test are used among
kindergarten children. The intelligence test is given to all
kindergartners to aid in deciding upon the advisability of
entering the first grade. As Gates has shown (1), there is no
optimum age for school entrance but the child's success in
learning to read is largely dependent upon the methods of
instruction. Therefore we do not establish any mental age as a
prerequisite for the first grade but depend upon the teacher's
[...] and reports on ability, adjustment and maturity in
conjunction with the psychologist's observations and the
intelligence test results.


*An alphabetical list of tests and publishers is appended. 


First Grade 


Intelligence testing is extended to include school entrants
without kindergarten experience in order that some measure
of ability is available to the first grade teacher to help in
forming and correcting her impressions. Achievement testing begins
in February with the Detroit Word Recognition, a measure of
word and phrase reading aided by pictures, and the Pressey
First Grade Word Reading Test, a test of three sections: a)
word recognition among grossly dissimilar words, b) word
recognition among words of similar initial letters, and c) word
recognition in terms of meaning. These provide some
information as to the breadth and techniques of the child's early efforts.
They are followed at the end of the school year by the
Metropolitan Achievement Tests, Primary I, which includes three
tests of reading: word and phrase reading aided by pictures,
word and phrase recognition among grossly similar words, and
word meaning in terms of definition. These give similar
indications to the earlier tests. Included also in the battery is a
Numbers test, a simple arithmetic measure. This we have
revised and adapted in terms of the newer de-emphasis upon
formal arithmetic at this level. Our adaptation omits the
addition and subtraction combinations and revises the
remaining 52 items in terms of reading, writing, counting, vocabulary,
time and money.


Second Grade 


The Pintner-Cunningham is employed with new school
entrants although maximum scores are rather frequently
achieved at this level among our pupils. Growth in
fundamental skills is assayed by the Metropolitan Primary II
battery. This includes reading tests of word, phrase and paragraph
comprehension, word meanings based upon recognition by
means of definitions, arithmetic tests of fundamentals and
problems and a spelling test. We pay little attention to the
actual grade scores achieved in any of these primary
achievement tests. Grade scores do not indicate the child's actual level
but rather the perf[...]
for example, that grade scores on this battery match actual
reading levels in this fashion: 1.6 to 2.0 equal to pre-primer to
primer; 2.0 to 2.5, primer to easy first reader; 2.5 to 3.0,
average first reader; 3.0 to 4.5, easy to difficult second reader.
Because of the artificiality of grade scores we tend to prefer
those locally constructed tests which, in the opinion of the
teaching staff and administration, adequately sample the facts
and concepts taught. Upon occasion, we do use the Gates
Primary Reading Tests, Type 1 as a measure of word
recognition and Type 3 as a general test of reading comprehension, and the
Dolch-Gray Word Attack Test as an estimate of the child's word
analysis skills.
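
The correspondence between grade scores and actual reading levels given in this paragraph amounts to a small lookup table, sketched here in Python. The names `GRADE_SCORE_BANDS` and `reading_level` are illustrative only. The article lets adjoining ranges share an endpoint (2.0, 2.5, 3.0), so the sketch assumes each band includes its lower bound and excludes its upper bound; that convention is not stated in the source.

```python
# (lower bound, upper bound, reading level); bounds are grade scores
# on the Metropolitan Primary II battery as reported in the text.
GRADE_SCORE_BANDS = [
    (1.6, 2.0, "pre-primer to primer"),
    (2.0, 2.5, "primer to easy first reader"),
    (2.5, 3.0, "average first reader"),
    (3.0, 4.5, "easy to difficult second reader"),
]


def reading_level(grade_score):
    """Return the reading level for a grade score, or None if outside the table."""
    for low, high, level in GRADE_SCORE_BANDS:
        if low <= grade_score < high:  # assumed half-open bands
            return level
    return None
```

A grade score of 2.7, for instance, falls in the "average first reader" band, well below what the nominal grade score suggests.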


Third Grade 


The Pintner-Durost Elementary Test, an intelligence test,
is used here since it provides two results, one dependent and
one independent of reading skill, both of which are compatible
with classroom and clinical observations. Subject matter
growth in arithmetic is measured by a diagnostic fundamentals
test constructed by the writer which evaluates performance in
all of the steps and processes taught during the year. Reading
growth in rate, comprehension and word meaning is evaluated
by locally standardized tests based on the basal and other
reading materials. Spelling is measured by tests drawn from
random sampling of the text in use. Formal record of
achievement is made through the Stanford Achievement Tests at the
end of the school year.


Intermediate Grades 


End of the year standardized achievement testing is
continued through these years using the Stanford or Metropolitan
batteries despite the false interpretations often made of their
grade scores and the lack of suitability of most of the tests.
Locally standardized measures of reading rate, comprehension
and word meaning, diagnostic tests of arithmetic fundamentals
and word analysis are used at varying times. The sixth
grade is conceived of as a time when definite preparation should
be given to enable the pupil to carry on the self-directed work
characteristic of the junior high school. Therefore, the Iowa
Work Study Skills Test is used early in the year to assay skill
in map reading, the use of dictionary and index, and familiarity 
with basic references. The Peabody Library Information Test 
is also used as a gauge of the pupil’s readiness to work inde- 
pendently and to use educational tools. 

We are not satisfied with the intelligence testing in these 
grades by group verbal tests. The Otis S-A, Otis Quick-Scoring 
and Kuhlmann-Anderson have been used at various times but 
like most group tests, these are open to the criticism that they 
measure intelligence insofar as it can be estimated through the 
medium of reading. The California Test of Mental Maturity 
which makes an attempt to overcome this situation has not, in
our experience, demonstrated its validity when compared with 
the individual clinical tests. We hope to have the opportunity 
of experimenting with group use of non-verbal tests such as the 
Kohs, Porteus, Goodenough, etc., and such group tests as the 
Pintner General Ability Tests, Non-Language Series in the 
hope of finding a combination that will indicate potential aca- 
demic ability without the obscuring influence of academic 
performance.

Throughout the elementary grades an intensive attempt is
made to understand each child's abilities and limitations. This 
is secured by individual testing, using the Stanford-Binet, Kohs 
blocks, Porteus Mazes, or the Wechsler-Bellevue, by careful 
explanation of the nature, purpose and interpretation of each 
test and by conferences. The latter are held, as the occasion 
arises, among several of the following: the psychologist, the 
teacher, the parents, and the principal. Annual promotion 
conferences, at which each child is discussed, are held by the 
principal with the present and future teachers in attendance.
Weekly conferences of the psychologist and teachers are held
to discuss subject matter, methods, or the implications of the 
most recent tests. 

Testing is conceived of as an instrument to reveal the
nature, abilities and disabilities of each child. We are interested
in knowing as accurately as possible each child's potential
academic ability and his development in the essential skills
during this crucial foundation period. Hence, we have for the
greater part avoided the artificial grade scores obtained from
many tests, selecting only those few that clearly analyze
development in a particular skill which is of significance, and
utilizing locally made tests which more adequately cover the
subject matter field than the commercial test can. This is
evidenced in our use of tests of word recognition, word analysis,
rate and comprehension of a body of continuous material, etc.
rather than depending upon the common measure of ability to
read isolated, unrelated paragraphs, called "reading
comprehension" tests.


Junior High School 


We have discarded most achievement tests at this level
because of the lack of ceiling and discriminatory power.
Cooperative Tests of Community Affairs, Social Studies for 7-9,
Reading Comprehension and Mathematics for 7-9 are used,
however. All show good discriminatory power and the
subtest results are compatible with other indications.

We are still looking for a suitable group intelligence test
other than those dependent upon reading, but we have had to
depend upon the Wechsler-Bellevue, Kohs and Porteus. In
these grades, curricular differentiation is begun between verbal
and manual-minded types of children based upon test results,
teachers' opinions and school grades. Advanced courses in
shop, mechanical drawing, etc., are offered where indicated and
non-regents and non-academic curricula are planned for
individuals. In conjunction with the seventh grade social studies
program, groups alternate in exploratory courses in shop, art,
music and home economics, each child spending an equal time
in each. These laboratories, as they are called, are keyed to
the social studies units but the vocational implications are also
[...]. In the ninth grade social studies one unit is devoted
to the study of vocations. Here the Kuder Preference Record
is used as an introductory step. In addition, cases in need of
remedial training in reading, speech or arithmetic are selected
for individual or small group instruction.


Senior High School 


Beginning in the ninth grade, the American Council on
Education Psychological Examination is used for intelligence
testing, the high school edition in the ninth and tenth grades, the
college edition in the eleventh and twelfth grades. This too is
a highly verbal examination and penalizes those with poor 
verbal facility or foreign language background. However, it 
does serve to point out those likely to experience difficulty with 
highly verbal or mathematical areas. We use it to advise cur- 
ricular choices, considering the linguistic section to be related 
to English and foreign language success, and the quantitative 
to science and mathematics. It does not serve to point out 
those who could succeed in non-academic curricula but this has 
been fairly well established already by individual testing and 
school history. 

Subject matter growth is measured by the Cooperative tests 
of science, social studies, foreign languages and mathematics 
supplemented by teacher-made tests where necessary. Reading 
ability is judged by the Cooperative Reading Comprehension
Test and the Social Studies Abilities Test, and remedial work
in this area is based on these.


Vocational and Educational Guidance 


Beginning largely in the tenth grade, an attempt is made
to help the student clarify his thinking about his vocational or
educational future. In addition to the conferences with his
home-room advisers, the dean or the psychologist, tests of
interest and aptitude are used. The opening measure is the
Kuder Preference Record where the scores in nine major fields
of endeavor are interpreted to the pupils as indicating the
similarity of their interests to people working in these fields.
Pupils scoring high in the mechanical area are asked to take the
Minnesota Paper Form Board Test, the Bennett Mechanical
Comprehension Test and the MacQuarrie Test for Mechanical
Ability. The Bennett is interpreted as a measure of the
understanding of simple [...] an understanding [...]
automotive and airplane mechanics, machinist, bench hand, etc.
The MacQuarrie is interpreted as a measure of simple
mechanical ability. Both local and authors' norms are used in these
interpretations.

All students scoring high in the clerical section plus all
commercial curriculum students are asked to take the Cardall
Primary Business Interests Test and the Minnesota Vocational
Test for Clerical Workers. The Cardall points out the
similarity of the pupil's interests to five general types of clerical
positions, a discrimination the average student fails to make. The
Minnesota is a general measure of clerical ability and again
permits discrimination among types and kinds of work. In
addition, the Bennett Stenographic Aptitude Test and Turse
Shorthand Aptitude Test are being used in the hope of securing
critical scores for admission to these courses.

Pupils scoring high in art or music interest on the Kuder are
given the opportunity of taking the Meier Art Judgment Test
or Seashore Measures of Musical Talent to secure some
estimate of their potential abilities in these fields.
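
The routing from Kuder interest scores to follow-up aptitude tests described in the preceding paragraphs can be summarized as a simple rule table. The Python sketch below is illustrative only: the article does not say what counts as "scoring high," so the cutoff `HIGH_PERCENTILE` is a hypothetical value, and the function and dictionary names are likewise assumptions.

```python
# Follow-up tests named in the text, keyed by Kuder interest field.
FOLLOW_UP_TESTS = {
    "mechanical": [
        "Minnesota Paper Form Board Test",
        "Bennett Mechanical Comprehension Test",
        "MacQuarrie Test for Mechanical Ability",
    ],
    "clerical": [
        "Cardall Primary Business Interests Test",
        "Minnesota Vocational Test for Clerical Workers",
    ],
    "artistic": ["Meier Art Judgment Test"],
    "musical": ["Seashore Measures of Musical Talent"],
}

HIGH_PERCENTILE = 75  # hypothetical cutoff; the source does not define "high"


def follow_up_tests(kuder_percentiles, commercial_curriculum=False):
    """List the follow-up tests suggested by a pupil's Kuder percentiles."""
    tests = []
    for field, battery in FOLLOW_UP_TESTS.items():
        high = kuder_percentiles.get(field, 0) >= HIGH_PERCENTILE
        # Commercial curriculum students take the clerical tests regardless.
        if high or (field == "clerical" and commercial_curriculum):
            tests.extend(battery)
    return tests
```

Keeping the rules in one table makes it easy to add instruments later, as the school hopes to do, without touching the selection logic.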

In conjunction with the state-required course in mental
hygiene, the Neher Health Inventory and the Johnson
Temperament Analysis are used to give the pupils some objective
measures in these areas.

As the programs of the dean and psychologist become more
closely coordinated and defined, we hope to increase the
vocational testing to include such instruments as the O'Connor
Finger Dexterity, and Tweezer Dexterity, the Minnesota Form
Board, a mechanical assembling test and to broaden the
analysis in art and music.

During the past year, a faculty committee undertook a
study of the attitudes and socio-economic backgrounds of the
school population by using the Wrightstone Scale of Civic
Beliefs and the American Home Scale. Interrelationships of
intelligence, cultural background and attitudes were studied as
well as the influences of age, grade, and sex upon attitudes.
The data provided considerable information about the student
population, in the opinion of the committee.




Remedial Work 


Remedial teaching is carried on by the psychologist and
several other members of the faculty throughout the system.
Training is given in regularly scheduled classes in reading,
arithmetic, spelling and speech. Dramatics are carried on in
the usual extra-curricular manner but are also used as an
activity and means of expression for pupils of good non-academic
ability or those needing speech assistance.

Numerous measures are used in evaluating the [...]
of individuals chosen for remedial help. In addition to the
usual group tests, which provide the initial selection, many
individual tests are given these pupils. In reading, the Binocular
Reading Test (2) devised by the writer, the Eames and
Jensen eye tests, the Durrell battery, the Gray's Oral, and the
Stone Narrative are given to almost all. Abilities of primary
children are also evaluated by the Dolch Basic Sight Word
Test, the Dolch-Gray Word Attack Tests and a phonics
inventory of the writer's. These provide some indication of the
degree of visual coordination and its influence upon [...],
the extent of visual defects, simple estimates of oral and silent
recall, sight word recognition, oral and silent rate and
understanding, as well as the child's knowledge and use of [...]
techniques. These indicate the particular emphasis of the
remedial work while informal tests are used to find the
appropriate levels of this work.

In spelling, the phonics inventory and the writer's [...]
tests of Mispronunciation, Spelling Rules and Spelling Errors
are employed. These indicate the relative influence of
mispronunciation upon misspelling, the knowledge and use of rules
and the types of errors. In arithmetic, we use our diagnostic
test of fundamentals which parallels the state syllabus, as well
as the Buswell-John, Brueckner and Wisconsin Inventory Tests.
In reasoning, we are attempting to formulate an analytic [...]
which will aid in differentiating among reading ability, number
concepts, arithmetic reading ability and skill in fundamentals
as causes of difficulty.


Records and Reports


We utilize a cumulative record form for the primary grades
which emphasizes adjustment, behavior traits and personality
characteristics. Throughout the elementary and junior high
schools, a marking system of S, S− and S+, signifying
satisfactory, is used. The child's own ability is used as a standard
rather than the achievements of the group and a child is not
denied promotion because of lack of academic performance.
Non-promotion is used largely with under-age, immature
children, or under-age children of low mentality, i.e., in those
instances where there is definite reason to believe that the
child would benefit by repetition. Reports to parents are
detailed and informal. They emphasize the child's adjustment,
progress in proportion to his ability and effort, as well as his
relationship to the work of the group and the grade.

In the senior high school, the usual per cent and letter marks
and honor rolls are used. Like most high schools, we have not
found a marking system which might supplant this and still
be acceptable to all persons concerned.


REFERENCES 


1. Gates, Arthur I. "The Necessary Mental Age for Beginning
Reading." Elementary School Journal, XXXVII (1937),
497-508.
2. Spache, George. "A Binocular Reading Test." Journal of
Educational Psychology, XXX (1943), 368-372.
3. Ibid. "One-Eyed and Two-Eyed Reading." Journal of
Educational Research, XXXVII (1944), 616-618.


TESTS


American Council Psychological Examination. New York: Cooperative Test Service.
American Home Scale. Chicago: Science Research Associates.
Binocular Reading Test. Chappaqua, N. Y.: George Spache.
Brueckner Diagnostic Test in Decimals. Philadelphia: Educational Test Bureau.
Brueckner Diagnostic Test in Fractions. Philadelphia: Educational Test Bureau.
California Tests of Mental Maturity. Los Angeles: California Test Bureau.
Cardall Primary Business Interests. Chicago: Science Research Associates.
Cooperative Tests. New York: Cooperative Test Service.
Detroit Word Recognition. Yonkers-on-the-Hudson: World Book Company.
Dolch Basic Sight Word Test. Champaign, Ill.: Garrard Press.
Dolch-Gray Word Attack Tests. New York: Scott, Foresman.
Durrell Analysis of Reading Difficulty. Yonkers-on-the-Hudson: World Book Company.
Eames Eye Test. Yonkers-on-the-Hudson: World Book Company.
Gates Primary Reading Tests. New York: Bureau of Publications, Teachers College, Columbia University.
Goodenough Measurement of Intelligence by Drawings. Yonkers-on-the-Hudson: World Book Company.
Gray's Oral Reading Check Tests. Bloomington, Ill.: Public School Publishing Company.
Iowa Every-Pupil Tests of Basic Skill. New York: Houghton Mifflin.
Johnson Temperament Analysis. Los Angeles: California Test Bureau.
Kohs Blocks. Chicago: C. H. Stoelting.
Kuder Preference Record. Chicago: Science Research Associates.
Kuhlmann-Anderson Intelligence Tests. Philadelphia: Educational Test Bureau.
MacQuarrie Test for Mechanical Ability. Los Angeles: California Test Bureau.
Mechanical Comprehension Tests, G. K. Bennett and D. Fry. New York: Psychological Corporation.
Meier Art Tests, Part 1: Art Judgment. New York: Psychological Corporation.
Metropolitan Achievement Tests. Yonkers-on-the-Hudson: World Book Company.
Minnesota Test for Clerical Workers. New York: Psychological Corporation.
Minnesota Spatial Relations Test. New York: Psychological Corporation.


Neher Health Inventory. Los Angeles: California Test Bureau.
O'Connor Finger Dexterity Test. Chicago: C. H. Stoelting.
O'Connor Tweezer Dexterity Test. Chicago: C. H. Stoelting.
Otis Self-Administering Tests of Mental Ability. Yonkers-on-the-Hudson: World Book Company.
Otis Quick-Scoring Mental Ability Tests. Yonkers-on-the-Hudson: World Book Company.
Peabody Library Information Test. Philadelphia: Educational Test Bureau.


Pintner General Ability Tests [...]. Yonkers-on-the-Hudson: World Book Company.
Pintner-Cunningham Primary Mental Test. Yonkers-on-the-Hudson: World Book Company.
[...] Yonkers-on-the-Hudson: World Book Company.
[...]nd Revision. New York: Psychological Corporation.
Pressey First Grade Word Reading Test. Bloomington, Ill.: Public School Publishing Company.
[...] Test. New York: Psychological Corporation.
[...] New York: Houghton Mifflin.
Reading Readiness Test Based on Alice and Jerry Books. New York: [...]
Seashore Measures of Musical Talent. New York: Psychological Corporation.
Stanford Achievement Tests. Yonkers-on-the-Hudson: World Book Company.
Stenographic Aptitude Tests, G. K. Bennett. New York: Psychological Corporation.
[...] Tests. Bloomington, Ill.: Public School Publishing Company.
Tests [...] Visual Acuity and Astigmatism. New [...]
[...] World [...]
Wrightstone Scale of Civic Beliefs. Yonkers-on-the-Hudson: World Book Company.
Wechsler-Bellevue Intelligence Scale. New York: Psychological Corporation.
Wisconsin Inventory Tests. Bloomington, Ill.: Public School Publishing Company.




PSYCHOLOGICAL TESTING IN RELATION TO 
EMPLOYEE COUNSELING 


HELEN PALLISTER¹
Washington, D. C. 


Measurement in Personnel Work 


IN any organization, the fundamental psychological fact
[...] on the one hand, [...] many problems of differentiation
of work [...]. On the other hand, such differences are a factor in
so-called personnel problems. Examples of such problems
are: inability to perform the job, failure to keep the required
[...] in the work, boredom, too much preoccupation with one's
own personal problems, inability to get along with one's
work-[...], resentment of supervision, lateness, absenteeism
and low morale.

In some cases, interviewing of those concerned in the
problem will reveal its cause, so that a satisfactory adjustment can
be made. In other cases, interviewing will not provide facts
upon which to effect a genuine solution. For example, an
individual who complains of his work assignment may be
reassigned to another kind of work, only to return later to the

¹ The ideas advanced by the writer have developed as a result of her
psychological training and experience in relation to the experience she has obtained in
personnel work in the Federal government. Upon entering the government in July
1942, she spent several months in the Test Construction Unit of the Examining
Division of the United States Civil Service Commission. Following this experience,
she was assigned for a few months to practical work on the transfer of personnel
[...] Federal agencies, as it was then carried out by The Civil Service
Commission. The writer was then appointed as the first Employee Counselor at the Civil
Service Commission. In this capacity she felt the definite need for a comprehensive
[...] program. Before accepting a position in the Training Section of the Division of
Departmental Personnel of the Department of State, the writer assisted for a few weeks on
a project involving planning for the testing of personnel in still another agency.
The content of this paper relates particularly to the application that could be
made [...] work of the Federal government. However,
it is felt that the concepts discussed can be adapted and applied elsewhere.


Personnel Office with more complaints. Eventually the con- 
clusion may be reached that something in the personality 
structure of this employee determines his inability to adjust to 
his work. In many cases, the personnel officer is frankly at a 
loss to diagnose specifically the cause of the particular personnel 
problem with any degree of certainty at all. 

The need for accurate measurement in personnel work has 
been apparent for years. Given the impetus of testing in the 
First World War, various organizations have employed psy- 
chological tests for the selection of their workers. The Federal 
government, itself, in the examinations of The United States 
Civil Service Commission, offers an example of the use of tests 
for selection. These examinations comprise tests of general or
special ability and also tests of achievement. Personality
factors relevant to the suitability of an individual for various kinds
of jobs have not been measured by the examinations, nor, in
many instances, have they been assessed at all.

Besides the Civil Service testing, there has also been some 
other testing carried out within the government. In some 
agencies employees have been placed in accordance with test 
results or have been upgraded by means of tests. Also,
employees enrolled for in-service training courses have sometimes
been tested before and/or during the course.

The Federal government has, however, a long way to go in
employing tests to the maximum of their usefulness. While
the armed forces have used tests systematically in a variety of
situations requiring the accurate measurement of personnel,
the government has, as yet, made use of only a small segment of
the total sphere of psychometric applications. In different
agencies testing programs have varied considerably both in
their extensiveness and in the psychological training of those
in charge of the programs.

Furthermore, there is no standard method of record-keeping.
Therefore, when a Federal employee transfers from one agency
to another, his test records, if such exist, do not follow him, as
they would, for example, if he were being reassigned in the
army.


Some of the broad personnel problems at present requiring
the use of psychological tests in the Federal government are:
the emotional and vocational adjustment of returning veterans,
the adjustment of employees separated by a reduction in force
or downgraded due to the operation of veterans' preference, and
the adjustment of employees reassigned due to changes in the
pattern of functions performed by the agency in peace time as
contrasted with war time.

A comprehensive testing program could, through objective 
measurement, assist in the maximum utilization of personnel,
the reduction of personnel problems, the raising of morale and 
the reduction of the cost of personnel administration. 

During wartime there was considerable emphasis on the 
maximum utilization of personnel. However, in order to utilize
personnel to the maximum, the characteristics of the personnel
must be accurately assessed. Without the kind of data that
psychological testing provides, many mistakes are made in
assigning personnel, in upgrading them or in other ways handling
them effectively. Personnel should, of course, be utilized fully
not only during wartime but also during peacetime, if the
peace is to endure.

Many personnel problems can be reduced through a testing
program that enables personnel officers to base personnel action
on measured individual differences. If personnel are placed and
upgraded in accordance with their abilities and personality
characteristics, work under supervisors whose characteristics
make for effective supervision, and are helped with the personal
problems that becloud their working capacity, there should be
a reduction in individual personnel problems and at the
same time an increase in group morale.

A testing program can also help to reduce the cost of personnel
administration. At the present time there is a waste of the
taxpayers' money in the form of mistakes made in personnel
administration, because of the lack of psychological measurement.
Since industry has shown that a testing program can pay for
itself, when it is properly conducted, there is already a
precedent for the government to follow.




The Nature of Employee Counseling 


Employee counseling is one of the newer personnel functions,
particularly in the Federal government. Therefore, a
brief exposition of it is appropriate, before the relationship of
testing to counseling is discussed. Departmental Circular No. 
439 of The United States Civil Service Commission, entitled 
Employee Counseling, contains the following statement: 
“Counseling may be defined . . . as an organized approach to the 
solution of individual employee problems which affect their 
general morale, efficiency, and productivity, the purpose being 
to assist management in maintaining a degree of stability in its
working force necessary for the fulfilment of its operating 
responsibilities.” 

The counselor's working day is spent in a variety of functions.
In some agencies every new employee is given an induction
interview by a counselor. The system of exit interviewing
set up to interview those separated from the service
for any reason is also usually handled by a counselor.
Employees who come voluntarily to the counselor or are referred
to her by their supervisors are also interviewed regarding their
problems. If the problem happens to be one of work adjustment,
and also sometimes in other instances, the counselor
usually confers with the supervisor and/or with other personnel
officers concerned. Such contacts are necessary, but must be
handled with discretion, since a counselor is bound by a code of
professional ethics not to violate the confidences of her
counselees. An interpretation to the counselee of the need for such
contacts must frequently precede the actual contacts.
The counselor also has the responsibility of interpreting to
management the interests and needs of employees. She likewise
maintains contact with community organizations that can service
employees in need of assistance with [...] health, child care, etc.
Since counseling is one of the personnel functions, there
should be a free exchange of information among the various
sections of the Personnel Office. The counselor will need to be
informed about placement activities, in-service training, job
classification and about the keeping of personnel records. The
rest of the organization, particularly the personnel office and
top management, will also need the general findings of the
counselor as an aid in making their decisions as to policy and
practice.

The counselor’s knowledge should, of course, not be limited 
to that of the various personnel functions. Rather, she needs 
a broad knowledge of the functions of the whole organization 
upon which to project the various individual problems that are
brought to her attention.


The Place of Testing in an Employee Counseling Program 


It is apparent that even in an ideal agency having a
comprehensive testing program, not all of the problems that
confront the counselor can be solved through the assistance of
testing. Problems on which testing would, in general, not
furnish pertinent information are housing, transportation, legal
aid, child care, financial need and health. In analyzing the
nature of her problems, the counselor should dichotomize the
problems into those on which measurement of the individual's
characteristics will furnish information pertinent to the solution
of the problem and those on which such measurement is
probably irrelevant.

It should be realized, of course, that since the individual is
a total functioning organism, there may be an interrelationship
among several problems, some of which seem to require
measurement for their solution and some of which do not. For
example, an individual who gets involved in a series of financial
involvements, such as the nonpayment of debts, might thereby
reveal a personality maladjustment which would be susceptible
to measurement, and which, upon further investigation, might
be shown to bear an indirect, if not direct, relation to the
individual's work adjustment.
There are problems of major importance both to the
individual employee and to management for the genuine solution
of which testing is essential. Such problems include vocational
adjustment, in-service training, and education, with emotional
and social adjustment cutting across them.


During the time that the writer spent in employee counseling
at The Civil Service Commission, 14% of all counseling
cases were concerned directly with vocational adjustment.²
The breakdown for the various kinds of problems within this
category is as follows:


TABLE 1
Nature of the Problem                               Percentage
Request for reassignment .......................       34.1
Request for promotion ..........................       30.5
Request for transfer to another agency .........       15.6
Resignation ....................................        6.3
Appeal of efficiency rating ....................        3.6
Miscellaneous ..................................        9.9


The table above shows that the two problems most fre- 
quently encountered by the counselor in the area of job ad- 
justment were a request for reassignment and a request for 
promotion. A great variety of reasons was given in different 
cases for the desire to be reassigned. In some instances the 
employee objected to the way the supervisor allegedly treated 
him. In other cases there were complaints about the variety 
or lack of variety in the tasks imposed, the fact that the indi- 
vidual's alleged skills were not being fully utilized, the physical 
conditions of the job or the characteristics of the employee's 
working associates. 
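
The reading of the table given above can be checked with a short tabulation. What follows is only an illustrative sketch in Python and is no part of the original study: the function names are hypothetical, and the two labels marked as assumed are readings of table entries that are partly illegible in the source. The figure of 14% is the share of all counseling cases that concerned vocational adjustment, as stated in the text.

```python
# Percentages within the vocational-adjustment category, as printed in Table 1.
table_1 = {
    "Request for reassignment": 34.1,
    "Request for promotion": 30.5,
    "Request for transfer to another agency": 15.6,
    "Resignation": 6.3,      # assumed reading of a partly illegible label
    "Appeal of efficiency rating": 3.6,
    "Miscellaneous": 9.9,    # assumed reading of a partly illegible label
}


def rank_categories(table):
    """Return (label, percentage) pairs, largest percentage first."""
    return sorted(table.items(), key=lambda item: item[1], reverse=True)


def share_of_all_cases(pct_within_category, category_pct=14.0):
    """Convert a percentage within the category to a percentage of all cases."""
    return round(category_pct * pct_within_category / 100.0, 1)


ranked = rank_categories(table_1)
top_two = [label for label, _ in ranked[:2]]
top_two_share = round(sum(table_1[label] for label in top_two), 1)
total = round(sum(table_1.values()), 1)
reassignment_of_all = share_of_all_cases(table_1["Request for reassignment"])
```

Under these figures the two leading requests together account for roughly two thirds of the vocational-adjustment cases, and the six entries sum to 100 per cent.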

In handling a case requesting reassignment, the employee
counselor interviewed the employee, eliciting from him the
story in his own words, supplemented by whatever questioning
was necessary in order to clarify hazy details of the account.
Whenever the employee agreed to the counselor's contacting
his supervisor, such a contact was made in order to obtain the
supervisor's estimate of the employee's efficiency and other
pertinent factors, such as his personality adjustment as it was
reflected in his work habits or dealings with his associates.

In many instances, the employee volunteered information
about his interests, abilities or skills that he thought would fit
him for another kind of job. Data from a testing program,
including tests of [...] personality, would contribute to the
solution of problems of reassignment, particularly if norms had
been set up for different occupational groups and for different
levels of grade within the same occupation.

² [...] indicate the total extent of the problem of vocational [...]

Since the agency had no such testing program for its
personnel, it was frequently necessary to accept the employee's
estimate of these variables. A check was made on the ratings 
that the employee had obtained in Civil Service examinations, 
but such information was never the complete answer to a prob- 
lem of reassignment, since the employee had not taken a battery 
of tests that would have furnished a comprehensive picture of 
him as an individual. 

The writer recalls the case of one employee who was
reassigned several times, each time only to return to the
counseling office with a somewhat different story of friction between
herself and her supervisor. Here probably was a personality
problem that might have been detected very much earlier,
perhaps through the use of one of the personality tests that have
proved their worth in the recent army testing programs, or
an adaptation of such tests.

In the case of requests for promotion, the need for a
comprehensive testing program is equally obvious. Since in Federal
positions the line of promotion frequently leads the individual
into a supervisory or administrative position, the ability of the
individual to plan and organize work and to get others to work
harmoniously together would need to be tested even more than
his mere technical skill in a particular operation in which he
has been engaged. While in recent years training has been
given to supervisors on job instruction, job methods and job
relations, no comprehensive attempt has been made to measure
individuals for their potentialities as supervisors. Here is a
recommendation that a scientifically minded counselor might
well make to management.
Of course, some employees are promoted from one
non-supervisory position to another such position. Even in this
case, however, nothing is known in most agencies of the ceiling
of the individual's ability in a particular kind of work nor of the
minimum ability or pattern of personality characteristics
needed for different levels of jobs. In counseling employees
requesting a promotion, such information is definitely needed.




Testing is also applicable to other kinds of job adjustment. 
Whether an employee should be encouraged to transfer to an- 
other agency or should be deterred therefrom depends upon his 
measured characteristics in relation to the needs of the agency 
where he is employed. It is true in most agencies that the 
counselor with test scores available at a counseling interview 
would be able to approach a discussion of transfer much more 
realistically than at present. Such data might also be relevant 
to a discussion of at least certain elements on the employee’s 
efficiency rating form. 

Test results are also pertinent to a request by an employee 
for in-service training. A case comes to mind of an employee 
who had completed a regular course of training on a special 
kind of typewriter required in the work of a particular section 
of the agency. The employee came to the counselor with the
complaint that the supervisor criticized her for failure to
maintain an adequate output in her typing work. The employee
requested further training on this typewriter. The problem
here was whether the employee had the potentialities ever to
become a proficient typist on this machine. In the absence
of measurements on the employee, she was granted permission
to re-enroll in the course. The training officer, who was
consulted by the counselor about the employee's progress, was
dubious about the employee's ever reaching a high degree of
proficiency. If minimum levels of ability for entrance into
training for various kinds of work were definitely known in
terms of test measurements, both the taxpayers' money and
the employee's time could be saved in first screening out those
candidates for in-service training who, in all likelihood, would
not be able to develop the required degree of skill.
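
The screening step suggested here can be pictured with a brief sketch. It is offered only as an illustration under stated assumptions: the class `Candidate`, the field `aptitude_score`, the function `screen_for_training`, the sample scores and the minimum level of 60 are all hypothetical and do not come from the paper, which reports no such figures.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str               # hypothetical identifier for the employee
    aptitude_score: float   # hypothetical score on an entrance test


def screen_for_training(candidates, minimum_score):
    """Split candidates into those meeting the minimum level and those below it."""
    admitted = [c for c in candidates if c.aptitude_score >= minimum_score]
    deferred = [c for c in candidates if c.aptitude_score < minimum_score]
    return admitted, deferred


# Invented scores, used only to show the mechanics of the screen.
candidates = [Candidate("A", 72), Candidate("B", 48), Candidate("C", 60)]
admitted, deferred = screen_for_training(candidates, minimum_score=60)
```

A fixed cutoff of this kind presupposes that the minimum level has itself been validated against later proficiency in the training course; without that validation the threshold is only a guess.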

Some employees approach the counselor with the problem 
of assisting them to plan for their further education. They 
frequently do not know for just what kind of work they would 
like to fit themselves, but are eager for a more definite under- 
standing of their own interests and abilities. If test results 
were on record for employees, a program of education in relation
to a realistic future vocational adjustment could be much better
planned than is true at present. While counselor at the Civil
Service Commission, the writer referred cases of this kind to
the public schools for testing. However, since the testing
carried out in the schools was not extensive and since the test
results were not related to the requirements for different kinds
of jobs in the government, their usefulness was necessarily
limited.

Problems of emotional adjustment come to the counselor’s 
attention in various forms. Sometimes an employee visits the 
counselor to discuss ways in which to develop greater self- 
confidence. Such a clear recognition of the need for a better 
emotional adjustment is probably rarer than cases in which the 
individual projects his emotional difficulties onto others or onto 
the work situation. The counselor’s office is a place where an 
individual can, through a free expression of his feelings, develop 
insight into the nature of his difficulties.

If, for each employee, one or more measures of emotional
adjustment, as well as measurements of abilities, interests and
skills, were available, such data could be used to supplement
the insight that develops out of non-directive counseling. Even
in cases where such measures would not be discussed directly
with the counselee, the data would still be a valuable source
of information for the counselor and any other personnel officers
concerned in the solution of the problem.
The results of personality tests would also be a clue to the
social adjustment that the individual would be likely
to make. Problems of various kinds brought to the counselor
could be seen in relation to the social adjustment of the
individual. An employee who would be likely to be a misfit because
he is too seclusive, too aggressive or too extreme in other
aspects of his behavior could be known in many instances on
a measured scale in relation to the other personnel of the
organization. Of course, complete reliance could not be placed in
all cases on such test scores, since it is possible to falsify
replies on written tests of personality. However, the writer
believes that if employees are motivated to answer the questions
[...] of the test results in a genuine attempt [...]
in accordance with their personality [...], the falsification
of replies will be kept to a minimum.




From the discussion above, it is possible to generalize about 
the contribution of a testing program to employee counseling. 
First there is an advantage in the diagnosis of a problem. It is 
possible to define many problems more objectively and with 
greater speed and precision when test results are available than 
when the individual concerned has not been submitted to 
measurement. 

A second advantage is in prognosis. With a battery of test 
scores available on an employee, it is possible to predict, again 
with greater objectivity, speed and precision, the extent to
which a satisfactory adjustment to any particular problem is 
possible. If an individual's measured characteristics are very 
much out of line in any way with the needs of various jobs in 
the organization, it is unlikely that he will ever make a very 
satisfactory adjustment there. Individuals with severe per- 
sonality defects or with an intelligence level below that of the 
minimum grade at which they are willing to accept employment 
would be examples of cases for which the prognosis of a good
adjustment is extremely unfavorable.
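
One way to make "very much out of line" concrete is to express an individual's score in standard-deviation units of a norm group. The sketch below is an assumption added for illustration and not a method described in the paper: the norm scores, the cutoff of two standard deviations, and the names `standard_score` and `is_out_of_line` are all hypothetical.

```python
from statistics import mean, pstdev


def standard_score(score, norm_scores):
    """Express a score in standard-deviation units of the norm group."""
    return (score - mean(norm_scores)) / pstdev(norm_scores)


def is_out_of_line(score, norm_scores, cutoff=2.0):
    """Flag a score lying more than `cutoff` standard deviations from the norm mean."""
    return abs(standard_score(score, norm_scores)) > cutoff


# Invented norm scores for one occupational group, for illustration only.
clerical_norms = [50, 55, 60, 65, 70]
employee_score = 45

z = standard_score(employee_score, clerical_norms)
flagged = is_out_of_line(employee_score, clerical_norms)
```

A flag of this kind would be a prompt for further inquiry by the psychologist and counselor, in keeping with the cautions above, and not a decision in itself.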

Thirdly, since counseling interviews also serve a therapeutic 
function through the development of insight on the part of the 
counselee, the results of a testing program can be used to permit 
the employee to see himself better in relation to other
personnel of the agency than he probably can without these
measurements. The reliability of the counseling interview will thereby
be increased. Of course, no blanket rule can be laid down about
the release of test results to a counselee. What is important is
that the interpretation of test scores be made to individuals for
whom it will enhance their degree of insight into their own 
problems. 

Fourthly, test results can be used in some situations as the
basis for recommendations to management. Counseling aims
not merely to assist in the solution of a series of individual
problems, but rather, through a study of the pattern of these
problems, to recommend policies that will remove the causes of
the problems. Therefore, test results can serve to substitute
accurate measurement for hunches in the elimination of many
problems. For example, if it is found that in a certain section
of an agency, a number of individuals complain of monotony in
their work and the results of testing show a uniformly high
intelligence level, such a finding might point to the advisability of
recommending a lower intelligence score as the ceiling for
employment in that section.


Practical Aspects of the Testing Program 


Since employee counselors are seldom trained psychologists, 
and since testing requires psychological training and experience, 
it is evident that the counselors themselves should not be
permitted to administer the testing program. This statement in
no way reflects on the ability of the counselors. It merely
means that general counselors should realize that they cannot
use technical instruments in the field of human measurement
any more than they can engage in the direct solution of medical
or legal problems.

The setting up of the testing program will depend upon a
number of factors that affect the particular agency. Some of
these factors are: the size of the agency, the availability of
funds, the kinds and levels of jobs within the agency, the kind
of appointment, whether temporary or long-term,
characterizing the personnel, and the degree of enlightenment of top
management.
If an agency is relatively small, it ought to be possible to
carry out more individual testing than would be possible in an
organization having thousands of employees, provided, of
course, that funds are available at all for a testing program.
The writer believes that, in view of the value of a
testing program as demonstrated in the armed forces, an
administrator who is really convinced of the necessity of scientific
measurement of his personnel can go a long way toward
obtaining the necessary funds through a presentation of the past
accomplishments of a testing program.

The tests that are selected for the various test batteries
should, of course, bear a direct relationship to the requirements
for the adjustment of the personnel within the agency,
particularly with regard to their jobs and their fitness for training.
A precursor to the actual selection of even a tentative test



examples of problems for the solution of which psychological 
measurements in the form of test scores should be available. 

7. A testing program can aid counseling in diagnosis, prog- 
nosis, therapy and also as the basis for certain recommen- 
dations of the counselor to management. 

8. The testing program should be carried out by psy- 
chologists. 

9. The testing program should be of sufficient scope so that
the various sections of the Personnel Office could utilize its
findings for the solution of many of their problems.

10. In addition to mass testing of all employees, some indi- 
vidual testing might be necessary at the request of the employee 
counselor in order to throw light on individual problems brought 
to her, for the solution of which the testing already carried out 
does not furnish assistance. 

11. Those in charge of the testing program should make
every effort to validate their results and in other ways to make
the program as significant as possible for their particular
agency.


REFERENCES

1. Altus, W. D. and Bell, H. M. "Validity of Certain Measures
   of Maladjustment in an Army Special Training Center."
   Psychological Bulletin, XLII (1945), 98-103.
2. Barron, M. E. "The Emerging Role of Public Employee
   Counseling." Public Personnel Review, VI (1945), 9-16.
3. Dreese, M. "Guiding Principles in the Development of an
   Employee Counseling Program." Public Personnel Review,
   III (1942), 200-204.
4. Grinker, R. and Spiegel, J. P. Men Under Stress. Philadelphia:
   Blakiston, 1945.
5. Harmon, L. R. and Wiener, D. N. "Use of the Minnesota
   Multiphasic Personality Inventory in Vocational Advisement."
   Journal of Applied Psychology, XXIX (1945), 132-141.
6. Pallister, H. Employee Counseling at the United States Civil
   Service Commission, December 24, 1942-December [...].
   Unpublished study on file, Reference Service, United States
   Civil Service Commission.
7. Remmers, H. H. "Psychology—Some Unfinished Business."
   Psychological Bulletin, XLI (1944), 713-724.
8. Rogers, C. R. "The Development of Insight in a Counseling
   Relationship." Journal of Consulting Psychology, VIII
   (1944), 331-341.
9. Rogers, C. R. "Psychological Adjustments of Discharged
   Service Personnel." Psychological Bulletin, XLI (1944).
10. Schmidt, H. O. "Test Profiles as a Diagnostic Aid: the
    Minnesota Multiphasic Inventory." Journal of Applied
    Psychology, XXIX (1945), 115-131.
11. Shartle, C. L. "Occupational and Vocational Counseling of
    Military and Civilian Personnel During the Period of
    Post-War Demobilization and the Years Immediately
    Thereafter." Psychological Bulletin, XLI (1944), 697-705.
12. United States Civil Service Commission. Departmental Circular
    No. 439. Subject: Employee Counseling. October 27, 1943.


PROJECTIVE TECHNIQUES IN A NEUROPSYCHIATRIC HOSPITAL


JULES D. HOLZBERG 


Mason General Hospital, Brentwood, New York 


In a psychiatric setting providing diagnosis and treatment
for patients, the psychiatrist must concern
himself with the total human being, including both the physical
and mental aspects of the individual. To obtain an accurate
picture of all of the elements that comprise the total personality,
he must of necessity turn to other specialists for assistance, i.e.,
neurologists, laboratory technicians, social workers,
psychologists, etc. At Mason General Hospital, the Army's largest
neuropsychiatric hospital, the psychologist is one of the
specialists who has played an important role in assisting the
psychiatrist with the process of diagnosis, treatment and
disposition through the use of psychological test data. The
psychiatrist has used the clinical psychologist to explore certain
areas of patient activity in order to clarify certain questions
that he may have concerning the patient. Thus the psychiatrist
has called upon the psychologist to evaluate the patient's
intelligence and to answer specifically such questions as: What
is his present level of intellectual activity? What relationship
does this level bear to his optimal level? What specific
intellectual abilities and disabilities does the patient possess? What
evidence is there of intellectual impairment or deterioration?


The activities of the psychologist have not been limited,
however, to an exploration primarily of intellectual functioning.
Many of the patients seen by the psychologists at this
hospital are seen in order to clarify certain questions
concerning their personality status, i.e., What are the patient's basic
traits and characteristics? What are his chief preoccupations?
What evidence is there that would indicate bizarreness,
dissociation, anti-social trends, or emotional instability? What are
the latent personality trends of the patient and what are his
basic personality patterns? Even further, the psychologist is
called upon to answer certain questions regarding the
psychiatric classification to which the patient belongs diagnostically.
To the questions raised about the patient's intellectual 
activity, the psychologist can give relatively valid responses by 
using standardized global intelligence tests, such as the 
Wechsler-Bellevue Scale (7) or the Wechsler Mental Ability 
Scale Form B (Army). The Wechsler has been used here to 
give more than an intelligence quotient or mental age. It has 
been interpreted dynamically in terms of scatter analysis,
quality of response, and test behavior. Where gross impair- 
ment exists, this test answers questions pertaining to the 
personality and to the diagnostic status of the patient. 
However, with many patients very subtle distinctions are
involved in determining personality and diagnostic status. This
makes it necessary to bring more subtle and sensitive
instruments, such as projective techniques, into play. On the basis
of a quantitative and qualitative analysis of the Wechsler, the
psychologist may make certain hypotheses but these may
remain unconfirmed. However, when the Wechsler is bolstered
by confirmatory evidence from more sensitive techniques, the
probable accuracy of the psychologist's judgment is increased.
Frequently the psychiatrist will ask for psychological signs
of psychosis. Again, the Wechsler may not be sufficiently
sensitive to detect these psychotic signs because the patient
may be maintaining superficial contact with reality. However,
the use of more sensitive instruments such as projective
techniques may show indications of psychosis that have been missed

[...] projection sheets of the sentence-completion type (6, 5). This
battery, brought into practice by pressures of work and limited
number of psychologists, has proven to be extremely valuable
in the study of neuropsychiatric patients primarily because it
gives weight to both the intellectual and personality aspects of
a patient's condition, and also because it combines controlled,
standardized techniques with less standardized but more subtle
projective techniques.
The question frequently asked is why projective techniques 
are more subtle, more sensitive. By definition, a projective 
technique is one the purpose of which is not apparent or obvious 
to the subject. There is relatively great freedom in responding 
to stimuli. There is no direct questioning of the subject's
behavior. Projective techniques are instruments which are
relatively less structured and consequently give the patient
sufficient area in which to freely wander and express abnormal
mental processes which are not readily observable in structured
tests.
The Bender Visual Motor Test has been an integral part of
our battery and our use and interpretation of this technique
have been based on the monograph published by Bender (1),
on the manual prepared by Hutt (3), and on the experiences
with this test at our hospital (4). The theory behind the use
of this instrument is that any deviation from the norm, as with
mentally ill people, will be revealed in visual-motor productions
which deviate from the designs which the patient is instructed
to copy. The test is not statistically standardized, and cannot
be used mathematically or mechanically. It has, however,
proven useful in approaching an understanding of the
intelligence, personality and diagnosis of the patient. With mental
defectives, it is found that the drawings resemble those
of children with, however, greater variability in the quality of
the production than is usually found in children's reproductions.
With organics, certain specific distortions of the designs will
accompany specific organic states. Generally, however, this
category of patients will display, among the distortions shown,
perseveration of errors, loss of ability to analyze into parts, and
the presence of auto-criticism without the ability to correct
errors. Schizophrenics may show dissociation or reflect the
actual splitting of gestalts, regression, fragmentation, loss of
directional orientation (rotation), and elaboration leading to
bizarreness. Psychoneurotics generally show no basic
distortions in the gestalt configurations in this visual motor function.
However, their productions may display infantile reactions,
showing regression but not to the level of the psychotic. In
addition, their drawings may be small in size and crowded into
a relatively small portion of the paper. Frequently there will
be some verbalization of insecurity.

Another projective instrument utilized at this hospital is 
the sentence-completion projection sheet, copies of which are 
appended at the end of this paper. These have been useful in 
gaining access to the present thought content and feelings of 
the patient. Attitudes toward the Army, the degree of group 
feeling that the patient possesses, feelings of rejection, rationali- 
zations, projections, and compensations are among the many 
things reflected by this technique. Psychoneurotics will
frequently reflect their anxieties, guilt feelings, presence of insight,
vague and specific complaints, and the presence of coherence
and good control. Schizophrenics usually display remoteness,
bizarreness, dissociation, unrelatedness and preoccupation. The
psychopath will frequently show an absence of conflict, a direct,
primitive, impulsive approach, and a lack of social identification.

Another technique utilized with our patients is the drawing 
of a man and a woman with pencil. These drawings have been 
utilized as a projective technique in order to permit the patient 
to portray his conception of human form and content. This 
technique has been especially useful in probing the phantasy 
elements of withdrawn and seriously blocked patients. It has 
also been useful in detecting the presence of deterioration.
Through this technique, it is possible to gauge the patient's use
of space, body relationships, use of shading, and other elements
relating to his conception of the human body. Depressed
patients have been found to produce small drawings with very
meager use of space. Obsessives have shown a meticulousness
of detail by filling in drawings or over-drawing the outline.
Psychotics show poor form conception and lack of insight as to
a proper evaluation of the various parts of the body. Sexual
disturbances will be reflected by discrepancies in the handling of
the male and female figures, the treatment of the genitals and
[...] areas of each of the figures. In studying mental
defectives, the drawings are frequently of differential
significance in the case of patients who function as mental
defectives on the Wechsler but who show a high level of phantasy
[...] in their drawings.

Several abstracts of actual case examples illustrating the
use of these techniques in the study of psychiatric patients
follow:


Case 1: 


On the Wechsler, this patient performed on the average
level on those tests which tend to hold up against deterioration
and on the defective or borderline level on those tests which
are more sensitive to deterioration. The original endowment
appears to be average. His Bender-Gestalt drawings show
marked regressive qualities, resembling the productions of a
pre-school child or a low-grade mental defective in all respects,
i.e. perseveration of looped forms for dots, horizontal lines
rotational and not parallel, great difficulty with angulated and
crossed forms, concepts as series and masses rather than as
absolute number and size. His man and woman are drawn
with similar facial appearances, even to elaborate eye-lashes
and ears reversed; arms and legs are primitive in construction
(perhaps more bizarre than infantile), with the impersonal
clothing of both figures drawn with some care and attention to
accuracy; however, marked perseveration is evident in clothes
detail. Although the father of two children, he shows a strong
identification with his mother on a regressive level on the
sentence-completion sheet: “If my mother . . . were here with
[...]” and “My best friend . . . my mother.” Psychological study
shows this patient to be a person endowed with average
intelligence, but showing marked deterioration to a present
functional level of borderline intelligence. Total test behavior,
including the presence of serious deterioration and bizarreness,
appears to be that of a schizophrenic.

Case 2: 


Patient was impotent during a marriage relationship and
the possibility of latent homosexuality as a neuropsychiatric
determinant was raised. His intellectual functioning was on
an average level and did not show marked signs of intellectual
deterioration. His Bender-Gestalt drawings suggest unusual
difficulty in approaching a new, conventional situation,
however simple or [...] of emotional elements. His drawings were
seriously distorted and broken up with a frequent failure to
maintain simple contact with the simple elements that the test
required. The net product was far below his intellectual level.
The drawings of a man and woman show very severe emotional
regression in his attitude toward people with a complete denial 
of the reality of sexual differences. Just as sex played a 
traumatic part in his life, so the necessity of distinguishing 
sex in his drawings created enough of a disturbance to make 
his products grossly inferior to his potential ability. Psycho- 
logical study shows recurrent evidence of a very serious dis- 
turbance in the area of sexual attitudes and adjustments. 


Case 3: 


This patient showed little scatter on the Wechsler and 
showed the greatest loss of efficiency on those tests involving 
social-emotional situations where impulsivity led to excep- 
tional intellectual inefficiency. His Bender-Gestalt drawings 
reflected his high intellectual ability, indicated a perfectionist 
approach, yet showed several deviations from “perfection”
resulting from impulsiveness. In spite of the intellectual
superiority reflected on the Wechsler, the drawings of a man and
woman are primitive, bizarre, non-social outlines devoid of
detail. A one-eyed woman was drawn originally with breasts
which were erased; the area was then crossed by a fragmentary
arm. This appears to reflect severe emotional blocking,
probably of sexual and social genesis. His projection sheet, although
at first reflecting intellectual evasiveness and facetiousness,
reveals fear of people, of failure, and of death, and a sense of
personal insecurity. For example: “My best friend . . . is
myself”; “I hate . . . ugly and morbid things”; “My greatest
worry is . . . that I will not make a success in my lifetime.”
Psychological study of this patient shows that he retains a
facade in the comprehension of a structured test situation
(Wechsler) and, consequently, shows no evidence of personality
distortion on this test. However, when exposed to more
ambiguous, more emotionally involved material, he reveals 
evidence of a basic personality disturbance. 


Case 4: 


This patient functioned at the low average level of
intelligence on the Wechsler despite evidence of higher original
endowment. Perseveration in an extreme degree is evident in
his reproductions of the Bender-Gestalt drawings, i.e. after
copying a row of dots, he was unable to shift to rows of circles,
copying them as dots. After drawing lines, he copied a
dot-figure as lines, corrected himself orally, began anew and for the
second time drew a solid line. On the third attempt, he drew
the dotted figure correctly. Bizarreness was manifested in his
drawings of a man and a woman. During the execution of the
latter drawing, he said, “I’d like to study drawing to know
anatomy; a person ought to know as much about the human
Bible, I mean body, as possible.” The drawings show little
identification with humans, no social awareness. The figures
are unclothed, distorted outlines in no way resembling real
people, definitely beneath the patient’s potential level. The
responses on the projection sheet indicate a preoccupation
with problems of spirituality and religion. They also suggest
disturbance over problems of sex and reproduction. Strong
guilt feelings are apparent. This patient's total behavior on
the Wechsler and the projective techniques strongly suggests
the presence of schizophrenia, as evidenced by perseveration,
bizarreness, lack of social awareness and preoccupation with
abstractions.

Case 5: 


This patient's reproductions of the Bender-Gestalt
drawings show an extreme perfectionist drive: much erasing,
redrawing, over-drawing. He also showed an obsession with
small details; for example, concern with the exact arc degree
of a tiny curve on the end of a long, wavy line, counting and
re-counting the number of dots on originals and on his copies.
Large flowing figures, twice the size of the originals, suggest
lack of adequate control or restraint. His drawings of a man
and woman are vacuous, geometric shells, markedly below the
level expected from one of his intelligence, despite the present
impairment. There is evidence of compulsive tendencies in his
preoccupation with the man’s hair. The projection sheet
proved very challenging to him and consumed a great deal of
time. There was much blocking, tacit debating, erasing and
rewriting. He omitted many items, particularly those heavily
weighted emotionally. His responses are neutral and
unenlightening except for an expression of antagonism toward
Army administration and of deep interest in home and family.
The psychological study of this patient shows a person of above
average original intelligence, the functioning of which is
impaired quite seriously. There is evidence of an
obsessive-compulsive neurosis as reflected by extreme perfectionism,
obsession with small details and an avoidance of emotionally
weighted stimuli.

With these techniques as with all clinical tests, the patient's
approach to the task is important, i.e. attitudes toward the
task, toward his productions, methods of work, gestures and
expressions. Definite diagnoses based on any one of the above
techniques cannot be made, but the trends exhibited may be
utilized to derive implications and to reinforce inferences.
Much research remains to be done before these projective
techniques can be utilized as quantitatively standardized
instruments to explore personality. They should be used
primarily to substantiate clinical findings leading to an
interpretation of the patient's total personality.

Where the use of the above techniques has not led to a
conclusive result, the Rorschach has been utilized. This
technique is by far the most important single projective technique
that the clinical psychologist possesses, but it must never be
considered as an end in itself, only as a means toward an end.
As with other techniques, one must always consider whether
the patient's responses are a typical example of his behavior
and one must be especially cautious to avoid false positives.

The use of the multiple-choice group Rorschach has been
attempted at this hospital to meet the problems of time of
administration, and the unavailability of skilled interpreters.
The group Rorschach, however, departs essentially from the
projective nature of the individual Rorschach in that it ceases
to be a spontaneous test. In this respect it really becomes a
new instrument. Our experience with this technique does not
recommend its use for patients already screened as being
neuropsychiatric. Its greatest contribution at the present time would
seem to be in screening the “probably ill” from the “probably
well” in much the same manner that direct personality tests
are utilized.

The Thematic Apperception Test has been of limited
usefulness to us because of the lack of skilled personnel possessing the
deep understanding of psychodynamics which is required for
the proper use of this technique. However, in those cases where
the specific content of the social and emotional problems of the
patient is especially significant, this test has been administered.

Projective techniques may prove dangerous and misleading
when interpreted by an individual who lacks the clinical
judgment which comes from emotional maturity and broad testing
experience under competent supervision. However, in the
hands of a good clinician, they have proven among the most
valuable weapons in the armor of the psychologist. Many of
the questions that psychiatrists seek help with cannot be
answered effectively by the psychologist unless these techniques
are used. On the other hand, there are cases in which they are
neither applicable nor desirable. The final decision then must
rest upon the clinical judgment of the psychologist.

The clinical psychologist does not give a definitive diagnosis
based on his tests. He simply supplies the psychiatrist with a
working hypothesis which results from his test findings. The
enlightened psychiatrist does not consider this competition in
his domain but rather uses the psychologist’s report as one of the
pieces that help fill in the jig-saw puzzle of the total
personality. In a military setting the clinical psychologist must
work with the psychiatrist. He must learn to use his skills and
techniques so that he gives his best to the cooperative endeavor.
A greater group of clinical psychologists and psychiatrists are
becoming better acquainted with what each other has to offer
than ever before. It is hoped that there will be even greater
crystallization of the relationship of these professions in the
post-war period.

SELF-IDEA COMPLETION TEST (5)


Finish These Sentences as Rapidly as You Can. Write Down the First Idea
That Comes to Your Mind. Let This Be an Expression
of Your Real Feelings.


1. [...]
2. I want to know ...
3. I feel ...
4. At bedtime ...
5. [...] food ...
6. [...]
7. [...] home ...
8. I regret ...
9. [...]
10. Other people usually ...
11. If my mother ...
12. What puzzles me ...
13. If I had my way ...
14. [...] sergeants ...
15. Other men ...
16. My nerves ...
17. My childhood ...
18. My greatest fear ...
19. The most [...]
20. [...]
21. [...]
22. My father used to ...
23. The hardest job ...
24. [...]
25. [...]
26. The happiest time ...
27. My great hope ...
28. The only trouble ...
29. If only the Army ...
30. The sharpest pain ...
31. I hate ...
32. I am very ...
33. Most officers ...
34. My job here ...
35. [...]
36. [...]
37. My [...]
38. I failed ...
39. My education ...
40. This war ...
41. I secretly ...
42. I cannot understand what makes me ...
43. [...]
44. [...]
45. My family ...
46. My most important decision was ...
47. My greatest worry is ...
48. I envy ...
49. If only ...
50. Today, I ...

(Add anything you wish to say)

Note to reader: Second column of stimulus phrases (26-50) should be put on
reverse side of test blank so as to permit adequate space for projection.

SELF-IDEA COMPLETION TEST (ABBREVIATED FORM)


Finish These Sentences as Fast as You Can. Write Down the First Idea
That Comes to Your Mind. Let This Express Your Real Feelings.


1. [...]
2. [...]
3. My father used to ...
4. I want to know ...
5. [...]
6. A woman ...
7. My greatest hope ...
8. [...]
9. [...]
10. [...]
11. [...]
12. [...]
13. If my mother ...
14. Other soldiers ...
15. Before I was in the Army ...
16. I am best when ...
17. My most dangerous ...
18. [...]
19. My nerves ...
20. My greatest fear ...
21. What puzzles me ...
22. Right now ...
23. Most officers ...
24. I failed ...
25. My education ...
26. I envy ...
27. What annoys me ...
28. My greatest worry ...
29. The happiest time ...
30. When I was a little boy ...

(Add anything you want to say)

Note to reader: Second column of stimulus phrases (16-30) should be put on reverse
side of test blank so as to permit adequate space for projection.


TENDLER E. I. TEST (6)*


SAMPLE

I sleep when ...
My hero is ...
I get angry when ...
I feel happy when ...
I love ...
[...]

* Reproduced by permission of the author and the Journal of Applied Psychology.


REFERENCES

1. Bender, L. A Visual Motor Gestalt Test and Its Clinical Use.
   New York: American Orthopsychiatric Association, [...].
2. [...] “Some Uses of Projective Techniques in Military
   Clinical Psychology.” Bulletin of the Menninger Clinic,
   IX (1945), 89-93.
3. Hutt, M. L. A Tentative Guide for the Administration and
   Interpretation of the Bender-Gestalt Test. (Mimeographed.)
4. Psychology Section, Mason General Hospital. A Guide to the
   Use of the Bender Gestalt Drawings. (Mimeographed.)
5. Shor, J. Notes on the Use of the Self-Idea-Completion Blank.
   (Mimeographed.)
6. Tendler, A. D. “A Preliminary Report on a Test for Emotional
   Insight.” Journal of Applied Psychology, XIV (1930),
   122-136.
7. Wechsler, D. Measurement of Adult Intelligence. Baltimore:
   Williams and Wilkins, 1944.




PSYCHOMETRIC TESTS AND CLIENT-CENTERED 
COUNSELING 


CARL R. ROGERS 

University of Chicago 
One hears various superficial and distorted statements as
to the viewpoint of client-centered or nondirective counseling
regarding the use of psychometric tests. Such statements
often include the notion that the client-centered counselor is
“against all tests” or “has no use for testing.” These
statements have their bases in the fact that the client-centered
counselor makes less use of tests than the counselor of the
traditional diagnostic-prescriptive viewpoint, and uses them in
a very different fashion. What are the reasons for these
differences?


Some Principles of Client-Centered Counseling


The primary fact which has given nondirective counseling
its impetus is the realization that a predictable, measurable
process can be set in motion in the client, a process which
releases forces of self-directing initiative, and forces making for
psychological growth. As this process has been studied by
research means¹ it becomes clearer that adherence by the
counselor to certain basic principles involving both attitudes and
procedures tends to further this process of client reorientation
and growth. Some of these principles, as they are seen at the
present time, are as follows.

¹ For a summary of recent research in this field see: Carl R. Rogers.
“Counseling.” Review of Educational Research, XV (1945), 155-163.

1. The counseling process is most likely to take place if the
counselor is an accepting, nonevaluating person, able to accept
the client as the client views himself.

2. The process is furthered by keeping responsibility centered
on the client. This should be true of all the minor aspects
of counseling, as well as the major aspects, if the client is
really to feel that this is his situation, to use as he desires. There
is left with the client the initiative for deciding what aspects of
his situation are of concern to him, what topics he wishes to
discuss, what attitudes he is ready to express, what direction
the conversation should take, whether he wishes to make
another appointment, et cetera. (This keeping of responsibility
with the client, and putting the development and direction of
the process in his hands, is genuine. It is not, as so many seem
to assume, a smooth way of subtly directing the client by
making him think he is responsible for what is going on. In
client-centered counseling he is responsible in the most complete sense
of these words.)

3. The central principle of this counseling process is that the
client, finding himself and all his contradictory attitudes
accepted, can drop his psychological defenses, can find release
from emotional tensions and, by examining those aspects of
himself which he has customarily denied and repressed, can
develop a new and very different concept of himself with which
to face the world on a much more realistic basis. He starts
afresh as his real self.

4. The counseling process is furthered if the counselor drops
all effort to evaluate and diagnose and concentrates solely on
creating the psychological setting in which the client feels he
is deeply understood and free to be himself. It is unimportant
that the counselor know about the client. It is highly
important that the client be able to learn himself. (Not to learn
about himself, but to learn and accept his own self.)

In making use of these principles the counselor examines his
own attitudes and techniques and endeavors to refine his
procedures so as to eliminate all which are not in accord with these
basic principles. Thus questions are eliminated from the
interview because they invariably direct the conversation, advice is
eliminated because it assumes the counselor to be the
responsible person, diagnosis and evaluation are put aside because it
has been learned that even when they are not voiced they tend
to distort the counselor's responses in subtle ways and to break
down his full acceptance of client attitudes. In similar ways
each customary counseling procedure, and any new ones which
may be proposed, are considered and evaluated in terms of the
principles which seem to be operative.


Application of Principles to Testing as a Technique 


Psychometric tests are considered as another possible
technique for the counselor’s use and are considered in the light
of the same principles. They do not stand up well as a
technique for client-centered counseling. If the counselor
suggests the taking of tests, he is both directing the conversation
and is implying, “I know what to do about this.” To
administer tests routinely or to have them administered at the
beginning of the contacts is to proclaim in the strongest
possible terms, “I can measure you, can find out all about you,”
and this implies to the client that the counselor can also tell
him what he should do. For the counselor to interpret tests
to the client is to say, “I am the expert, I know more about you
than you can know yourself, and I shall impart that superior
knowledge.” In other words, when tests are used in the
traditional fashion they contradict almost completely the
principles of client-centered counseling. They make the counselor
primarily responsible for the process even though he shares
his responsibility with the client. They are by their very
nature evaluative, passing judgment of one sort or another on
the client. They tend to make the client feel that only the
expert can know about him rather than make him feel that he
can discover himself. Because they have norms, they make it
more difficult for the client to accept himself when he differs
from the norm or from the accepted standard.

By every criterion, then, psychometric tests which are
initiated by the counselor are a hindrance to a counseling process
whose purpose is to release growth forces. They tend to
increase defensiveness on the part of the client, to lessen his
acceptance of self, to decrease his sense of responsibility, to create
an attitude of dependence upon the expert. Consequently, it
is our experience that once a counselor has used a
client-centered approach, once he has observed the release of
constructive forces in the client, he is no longer willing to use
psychometric tests in the traditional fashion.



Testing is not necessarily completely excluded from the 
counseling process, however. The client may, in exploring his 
situation, reach the point where, facing his situation squarely 
and realistically, he wishes to compare his aptitudes or abilities 
with those of others for a specific purpose. Having formulated 
some clear goals, he may wish to appraise his own abilities in
music, or his aptitude for a medical course, or his general intel- 
lectual level. A girl with many deeply neurotic characteristics, 
whose initial interviews were filled with references to the great 
researches she expected to carry on and the significant books 
she was going to write, requested, in one of her last interviews, 
that an intelligence test be given to her. She was able to accept 
quite realistically the fact that her ability was above average 
but was in no way outstanding. 

Consequently, when the request for appraisal comes as a
real desire of the client, then tests may enter into the situation.
It should be recognized however that this is not likely to occur
frequently in practice. It should be further recognized that the
significant elements with which the counselor deals are the
emotional attitudes of satisfaction, doubt or fear which the test
creates. It is not the factual test results but the attitudes of
the client toward the test results, which are important in the
counseling process.²

Summarizing the situation briefly we may say that the
positive results of client-centered counseling appear to be due
to the fact that the process is centered in and determined by
the client. For this reason tests are never used on the
counselor's initiative as a part of counseling. Tests may be desired by
the client and introduced at his request, but even when this is
done the focus of counseling remains on the emotional attitudes
which are expressed, whether these attitudes are concerned
with psychological tests or with other aspects of the client's
environment.

² A very satisfactory discussion of the client-centered use of tests
is contained in the article by Ray and Virginia Bixler, “Clinical Counseling in
Vocational Guidance,” Journal of Clinical Psychology, I (1945), 186-192. This
article points up the fact that even in a setting where students and counselors alike
have regarded tests as the center of all counseling, a client-centered approach brings
about a very different orientation on the part of the client, and a very different use
of tests.


The question may be raised, “How does the client know that
tests are available if this is not mentioned by the counselor?”
The answer is simply that he would not know, and that as far
as present knowledge would indicate it is not especially
important that he should know. The client may work out his
relationship to life by considering such diverse topics as the way
in which he deals with people, the results on his psychological
test, the manner in which he decides what clothes to wear, or
the reaction he feels when his father speaks to him. In other
words, it would seem that the individual can consider his own
pattern of reactions as they are evident in many different
situations, and his pattern of reaction to a psychological test and
its findings is one such possibility. The client will probably
gain just as much by considering and working out an
orientation to his father’s evaluation of him, as to a psychological test’s
evaluation of him. Research is needed to throw further light
on this, but clinical experience would suggest the viewpoint
here expressed.

Other Uses of Tests 


Though the above discussion may make it plain that
psychometric tests are of minimal importance in the client-centered
counseling process, this is in no sense an attack upon tests as
such. It may be well to point out that in other connections
and for other purposes tests have a great deal to offer.

In the first place, where a person or an organization must
make a responsible decision about an individual, tests are very
helpful. When a medical school must choose 100 out of 300
candidates; when the army is selecting men who [...]
learning Japanese; when an industry is selecting from a
large group of applicants those best fitted to be welders, then
tests constitute a useful tool. These are situations in which
the responsibility lies not with the individual, but with [...].
The individual is not free to make his [...] but is
subject to the decisions others will make about him. In
such [...] tests provide one of the fairest, most [...],
most objective means of making these [...], provided
that the tests are adequately constructed for the purpose and
are satisfactorily administered. We need, for example, tests
which will select those who have good potentialities as
counselors.

The second major use of tests is in the field of research. 
The objective measurements upon which research is based are 
often best supplied by tests. In the field of counseling and 
psychotherapy, for example, several studies are nearing
completion in which personality tests and other evaluating devices
are being used. Tests are given prior to the beginning of
counseling and following its conclusion in order to see what
measurable changes occur. It should be noted, however, that
this is to serve a research purpose, not a counseling purpose.
The client is not told why the tests are being administered
except that they are part of a research study. He is not told
the results, nor is the counselor made aware of the results.
Thus the damaging effect which testing has upon the process
of client-centered counseling is avoided, but research interests
are very definitely served.
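
The before-and-after comparison described above comes down to simple arithmetic on paired scores. The following is only a minimal sketch in Python and not the procedure of the studies mentioned: the scores are invented for illustration, and the names `pre_scores`, `post_scores` and `score_changes` are hypothetical.

```python
from statistics import mean

# Hypothetical inventory scores for five clients, invented for this sketch.
# Each position holds the same client's score before and after counseling.
pre_scores = [42, 55, 38, 61, 47]
post_scores = [50, 58, 45, 60, 56]


def score_changes(pre, post):
    """Return the change for each client (post score minus pre score)."""
    if len(pre) != len(post):
        raise ValueError("each client needs one pre score and one post score")
    return [after - before for before, after in zip(pre, post)]


changes = score_changes(pre_scores, post_scores)
mean_change = mean(changes)

print("change per client:", changes)
print("mean change:", mean_change)
```

Because the results are withheld from both client and counselor, a tabulation of this kind would be carried out by the research worker only after the counseling contacts have ended.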

In conclusion it may be said that the counselor who has
come to use the client's motivation for growth as the
mainspring of the counseling process is not opposed to tests, but has
found them unsatisfactory for promoting client growth. For
one thing, counselor-administered tests interfere with the
process of catharsis, insight, and positive choice which has been
shown to be characteristic of growth as it takes place in therapy.
It also seems to the client-centered counselor that the
measurement of abilities and personality traits as though they were
static loses much of its significance in the light of counseling
experience. The changing and dynamic use the individual
makes of his abilities, the self-initiated changes in personality
characteristics which occur as a result of counseling, seem
more important than the measurement of these fluid entities
in terms which give them a spurious permanence. Only when
(1) the need to take tests is a significant aspect of the client's
symptomatic behavior, or (2) it is impossible for the client
to be responsible for a choice or (3) research purposes require
a measurement of an admittedly changing characteristic, do
psychometric tests seem to have a purpose with which the
nondirective counselor can agree.


TEST INTERPRETATION IN VOCATIONAL 
COUNSELING 


RAY H. BIXLER 
University of Minnesota 
AND 
VIRGINIA H. BIXLER 
Vince A. Day Center 


There are two aspects to the problems of test
interpretation: (1) the presentation of test results and their predictive
value in a manner which is understandable to the client,
and (2) methodology of dealing with the client in order to
facilitate his use of this information. The ultimate goal of
vocational guidance is not only accurate prediction but also
optimal use of the prediction by the client. It is in this respect
that vocational guidance differs most from personnel selection.

Vocational counseling as a process has not received a great
deal of attention in the literature. Neither formal discussions
nor case records provide the counselor with an understanding
of how the counseling process develops. The usual case record
merely states “Tests were interpreted.” There has been no
[...] study of various counseling procedures and their
effectiveness when tests are introduced into the process.
However, it is only through the evaluation of counseling processes
that we shall be able to improve the more subjective skills
involved in dealing with client motivation, the factor which
facilitates or handicaps his use of job information, test data,
and academic planning.

Test interpretation seems to fall into two broad categories:
(1) that which involves the opinion of the counselor as well as
the data; (2) that which deals with the prediction alone.
Examples have been taken from case records to illustrate each
approach.


Interpretations Involving Opinions of the Counselor


1. Clinical Interpretations (as opposed to scientific).
George verbalized an interest in medicine. Measures of
established predictive value indicated that the vast majority of
students at his level of academic aptitude and achievement would
succeed. However, he earned a low score on the Cooperative
Natural Science Achievement Test which has little or no
predictive value for medicine (at the University of Minnesota).
The counselor felt this was evidence that George would be
handicapped in the pre-medical curriculum. On this basis, he
urged George to go into business or law, his secondary interests.

2. Interpretation Involving Persuasion. Robert's test results
indicated that success was more likely for the majority
of students with scores like his in fields other than engineering,
his preference. In interpreting the tests the counselor
explained that he had a better chance of success in business and
urged him to enter this field because he would be happier in a
field where he was successful.

3. All or None Interpretation. A graduate student who
received a percentile rank of 19 on the Miller's Analogies, as
compared to graduate students, came to one of the writers in
tears saying, "Dr. X. told me that graduate work was the last
thing in the world I should be doing. He said I had no business
even attempting it."


Interpretation Involving Little or No Opinion 


1. Statistical Prediction Applied to the Individual Client.
"You have an 80% chance of succeeding in agriculture and a
60% chance of succeeding in business."

2. Straight Statistical Prediction. "Eighty out of one
hundred students with scores like yours succeed in agriculture,
while sixty out of one hundred succeed in business."

The above interpretations need not be mutually exclusive
and seldom are. In order to evaluate these approaches, one
must consider them in relation to difficulties which may hinder
the client's acceptance of information which is offered.

Distortion of information on the part of the client is a
frequent obstacle. The client's desires and fears interfere with
the use he may make of information and may color his
interpretations.




Even in the traditional information-giving situation of the
classroom, instructors are aware of the fact that distortion does
operate. The grading of examinations at the end of the quarter
verifies the ineffectiveness of books and lectures in giving
information to students. Vocational test interpretation is much
more personalized and there is greater opportunity and reason
for the student to distort or disregard information given to him.

It is not difficult to find examples of the distortion of data
by the client. One young man who had been tested and counseled
reported to the speech clinician who was responsible for
the referral that he was in the upper 20% of the general
population in intelligence. In response to a question about the rest
of the tests he replied that that was all the counselor had told
him. In reality, this client had been given information
concerning the complete battery of tests he had taken. He had
chosen to remember only that aspect which was important to
him. An emotionally immature adult, he was the rejected
member of a family of three sons. He didn't go to college as
had his brothers because his father decided he was "too dumb."
One would expect him to cling to his intelligence test results,
which seemed to vindicate him. All other results were quite
extraneous to his needs.

Another client, after being told his results on the Kuder
Preference Record, and their significance, decided that they
meant he should go into engineering despite the fact that his
computational interest score was at the twentieth percentile
and his only high percentile was persuasive. As he said himself,
he "had never thought of anything but engineering."
Reports often filter back to a counselor about the "things
recommended and discouraged" which have no basis in fact.
The distortion of information is usually more in keeping with
the desires of the client than the actual test results. Distortion
seems to occur more frequently with interest and personality
tests.

Another obstacle to optimal use of test interpretation by
the client is the occasional traumatic effect of the predictions.
Failing students frequently turn to vocational tests in the hope
of determining another field in which they can be successful.
When test results indicate that they are not suitable college
material they are brought face-to-face with a terrifying fact.
Their defenses are stripped from them by the concreteness of
the data. Here test results operate in much the same manner
as an interpretation of emotional behavior to a disturbed client
who is not ready to accept it. Intellectual recognition of
limitations can be traumatic when it is not also accompanied
by an emotional acceptance.

Therefore, in choosing a method of test interpretation and
guidance, the counselor must remember that the client may
find it necessary to distort or disregard information or that he
may become disturbed by its significance.
The method of test interpretation and vocational counseling
described in the remainder of this paper has been employed
with college students and in the rehabilitation of the tuberculous.
Any evaluation of it must be empirical at present.

In accepting any philosophy of counseling, one's answers
to the following questions are pertinent.

1. Shall the counselor's goal be to avoid failure on the part
of the client?
2. Shall the counselor pave the way for the client?
3. Shall the counselor contribute his opinion as well as
information?
(There are many who feel that the counselor's opinion
is his major contribution because there are now areas in
which there is relatively little scientific certainty.)
4. Shall the counselor adhere to the concept that the client
is fundamentally responsible for the decisions made and
the manner in which they are carried out?

In other words, how much can the counselor respect the
integrity of the client? The method of vocational counseling
which will be presented is based upon this faith in the
fundamental integrity of those we assist. The counselor does not
urge a plan of action nor does he set goals. The counselor's
responsibility is to give the client information, clarify his
attitudes toward that information and towards his limitations,
and finally to assist him in implementing his plans.

How the process of vocational guidance is structured for the
client will affect his reaction to this method of test interpretation.
The preliminary interview has been described elsewhere.
After taking the battery of tests the client is given an
interpretation of the results without the counselor's opinions.
The results are presented in general terms and illustrated with
examples. A student in the upper 10% of his high-school class
and in the upper 25% on a college aptitude test might be told,
"We have found that the best indication of success in most
college courses is how well you do in high school and how you
do on a learning ability test. You were in the upper 10%
of your high-school class and exceeded seven or eight out of ten
college students on the learning ability test. Most people with
scores like that learn complex things relatively easily and
quickly. For example, most students with scores like yours
would succeed in college and get better than average grades."
The counselor should use actual prediction tables when they
are available. The last sentence of the interpretation then
might be "Eighty out of one hundred students with scores like
yours would succeed in college and sixty would get better than
average grades."
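The arithmetic behind a statement of this kind can be sketched briefly. What follows is only an illustration under stated assumptions, not the writers' procedure: the follow-up counts, the table called expectancy, and the function called frequency_statement are all hypothetical and invented for the example. It shows a row of a prediction table being turned into the impersonal "so many out of one hundred" wording rather than into a personal probability.

```python
def frequency_statement(successes: int, total: int, field: str) -> str:
    """Phrase one row of a prediction table as a group frequency.

    successes and total are hypothetical follow-up counts for past
    students whose scores fell in the same range as the client's.
    """
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need total > 0 and 0 <= successes <= total")
    # Scale the observed proportion to a count per one hundred students.
    per_hundred = round(100 * successes / total)
    return (f"{per_hundred} out of 100 students with scores like yours "
            f"succeed in {field}.")


# Hypothetical table: field -> (students who succeeded, students followed up).
expectancy = {"agriculture": (160, 200), "business": (90, 150)}

statements = [frequency_statement(s, n, field)
              for field, (s, n) in expectancy.items()]
```

The wording is deliberately about the group of students with similar scores and leaves the application to the individual client unstated.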
The counselor does not personalize the prediction for the
client, nor does he imply in any fashion what he thinks the
client's course of action should be. This responsibility is
assumed by the client. This tends to free the client to discuss
his reaction to the test results and to clarify the application he
may make of them to his problems. The counselor who states,
"You ought to do excellent work in college," will probably find
the client less responsive, and as a result will be of less service
in helping him to integrate the data with his personal desires.

Even the interpretation of low academic aptitude should
be handled in the same factual manner. Some counselors
[illegible] with such clients, while others [illegible] the client
towards an occupation requiring little academic aptitude. It
would seem that the interpretation of low scores in the same
way as high scores is also a matter of ethics. Clients in a
neutral setting can and frequently do make a real growth in
self-acceptance if they are free to give vent to their anxieties
and disappointments.


Personality and interest tests are difficult to interpret since
they are demonstrated to have little or no predictive value, and
the question of what they actually measure remains unanswered.
In spite of this inadequacy some vocational counselors
still base their decisions about which field the client should
enter largely upon the way a client classifies himself on an
interest test. Sometimes other tests such as the Minnesota
Clerical and achievement tests are used to encourage clients
to enter fields for which they have no predictive value. Perhaps
this is due to the need felt by the counselor to give more
than he can in terms of vocational advice.

The following procedure is suggested as an alternative:
"This test gives us an indication of what you may enjoy doing.
So far as we can tell it has nothing to do with how successful
a person will be in a field. The majority of people with scores
like yours enjoy helping people. (High social service; artistic
and musical secondary.) Fields like social work, clinical
psychology, nursing and teaching appeal to them. People with
scores like yours are also somewhat interested in art and music.
There are areas which combine both of these interests, like
occupational therapy and nursery school work." This
interpretation is impersonal; it enables the client to relate it to himself
or to reject it, and it frees him to clarify his own motivation.
Interpretation of personality tests is even more challenging.
Because they deal with the most personal qualities of the
individual, their interpretation is often traumatic. Neither
of the writers feels he has found a satisfactory method of
interpreting these tests to clients, and for the most part does not
attempt to do so. Of the Bell Adjustment Inventory, for
example, one could say, "You seem to feel that you have more
difficulties at home than you do at work, or in your [illegible]
living."

A client categorized as maladjusted is usually unable to use
it in a constructive sense. On one hand such a person finds it
necessary to rationalize his test results, or otherwise defend
himself, making it difficult for the counselor to serve him in a
therapeutic sense while, on the other hand, his problems are
intensified by this seemingly undeniable objective measure of
his weakness.


The frequency with which clients come to counselors quite
disturbed by personality test interpretations given by others
is, in line with our own experience, mounting evidence
that when such interpretations are given at all, they must be
adroitly handled.

Personality tests do not seem to contribute to the
psychotherapist since they yield symptomatic diagnosis rather than
any picture of causal relationships. Their use in personnel
selection seems justifiable even in their present stage of
development, but it is difficult to know what they can contribute
to counseling.

Actual statements of prediction are only the beginning
phase of vocational counseling. When the client begins to
apply these predictions to his own plan, deciding what they
mean to him, and what he wishes to do as a result of them, the
more crucial phases of counseling have begun. The client either
integrates the test predictions into his thinking and thus makes
use of them, or he distorts or rejects them. The more he feels
free to discuss his reactions with the counselor, the more likely
it is that he will come to a logical acceptance of their
significance. The following case excerpts illustrate this phase of
vocational counseling:


C. There are studies which demonstrate that students' ranks
in high school, along with the way in which they compare
with other entering students in mathematics, are the best
indication of how well they will succeed in engineering.
Sixty out of one hundred students with scores like yours
succeed in engineering. About eighty out of one hundred
succeed in the social sciences (names several). The
difference is due to the fact that study shows the college
aptitude test to be important in social sciences, along with
high school work, instead of mathematics.

S. But I want to go into engineering. I think I'd be happier
there. Isn't that important too?

C. You are disappointed with the way the test came out, but
you wonder if your liking engineering better isn't pretty
important?

S. Yes, but the tests say I would do better in sociology or
something like that. (Disgusted.)

C. That disappoints you, because it's the sort of thing you
don't like.

S. Yes, I took an interest test, didn't I? (C nods.) What
about it?

C. You wonder if it doesn't agree with the way you feel. The
test shows that most people with your interests enjoy
engineering and are not likely to enjoy social [illegible]

S. (Interrupts.) But the chances are against me in engineering,
aren't they?

C. It seems pretty hopeless to be interested in engineering
under these conditions, and yet you're not quite sure.

S. No, that's right. I wonder if I might not do better in
something I like. Maybe my chances are best in engineering
anyway. I've been told how tough college is, and I've
been afraid of it. The tests are encouraging. There isn't
much difference after all. Being scared makes me overestimate
the difference.

He decides to go into engineering and seems quite at ease
with his decision.

The next excerpt portrays a different problem. The student
has been in pre-medicine for two quarters and is beginning
to fail. His scores on all tests are very low. Some
explanation of prediction has already been given:

C. About two or three students out of one hundred with
scores like yours succeed in pre-med.

S. I knew they'd turn out like that. (Disappointed.)

C. Even though you expected this, it's pretty hard to take.

S. Yes sir, but I got off to a bad start this year. It's the
[illegible] story. My advisor discouraged me, so did Mr. R.
in Dean X's office, and now the test discourages me. I
[illegible] another quarter next fall with a fresh start. I think
starting new with a good rest I can do it. If I fail then I'll
know I can't be a doctor, but I'm not satisfied with that yet.

C. You feel everything discourages you, but you haven't given
yourself a fair trial. You think next fall will tell the story.

S. Yes, I do, even though they didn't agree with me, and the
tests are on their side.

The third illustration deals primarily with distortion of
data. The client's interest scores were typically persuasive
(99th percentile). Other scores: mechanical (72nd percentile),
computational (20th percentile), science (70th percentile). C.
has already interpreted results.


S. That means I'm best suited for engineering, doesn't it?

C. That's the way it seems to stack up to you.

S. Yes. (Turns the discussion to persuasive fields and merits
of various phases of them then.) I really ought to be
much more interested in mathematics to go into engineering,
shouldn't I?

The trauma of low scores is illustrated in the next excerpt.
The counselor has indicated that about fifteen to twenty
students out of one hundred succeed in college:


S. (Looks stunned, then confused.)

C. This is awfully disappointing.

S. Yes, it is. I had hoped I'd find something I could succeed
in.

C. It seems to leave you without anything to go into.

S. Yes, but I can do the work. I have trouble concentrating,
my study habits are poor, I never studied in high school
and I don't know how.

C. You feel the reason for your trouble is your poor study
habits, not a lack of ability.

S. Yes, I didn't get good grades in high school, but I didn't
study either. Now when I want to study I worry and
get tense. My mind goes blank when I take tests.

C. You're pretty worried about your school work and that
seems to make it harder to succeed. (Pause.)

S. It's my last hope. (Head sinks on chest, lips quiver.)

C. You're so upset about this you feel like crying.

S. (Does.) I feel so silly. (C recognizes her embarrassment,
and she continues to cry and discuss various elements of her
anxiety about school.) I've got to make good. I'm not
as smart as most kids, that's true. There are subjects
that go over me, but I think I can make it. I don't
know what to do.

C. You have to make good and yet you're afraid you can't.
It leaves you pretty badly mixed up.

S. decides to continue seeing C. until she can work out a
solution. She leaves interview accepting her limited ability,
but is not sure which of several courses to take.

The counselor has made no attempt to correct distortion or
encourage a plan of action, or to comfort the client through
reassurance. If the counselor has given the client an adequate
interpretation, further explanation at that point is of less value
than an opportunity for the client to come to grips with his
motivation for distortion. In the first illustration the prospective
engineer not only arrives at an excellent application of a
test to his own problems, but is capable of minimizing the
anxiety he has held for college work when he has insight. In
the second and third illustrations the counselor could have
stepped in to correct the client's application of test data to
himself, but it is questionable whether this would have achieved
anything. The counselor's acceptance and clarification of the
client's attitude did seem to bring each to a better
understanding.

The pre-medical student brings into focus the ineffectiveness
of authoritative advice when the client is not in agreement.
Discouragement on the part of this student's advisor, the dean's
assistants and the test data was ignored. Perhaps it is necessary
for some students to be faced with the reality of failure
in order to change their goals. This client's goal probably
never could be changed by counselors. The persuasion of
counselors only motivated him to strengthen his defenses and
postpone acceptance of the inevitable. This client may return
for further help if he feels a need for it, because the counselor
has not made eventual failure an issue between them.

In the last illustration the client is able to express her
anxiety, to obtain a better acceptance of her limitations, and
to come to a realization that there is a solution. She was
disturbed by her college experiences to date, and the test
results intensified this. The counselor's recognition and
clarification of feelings has been instrumental in her expression of
these anxieties and her subsequent modification of them.

When the counselor allows the client to make his own personal
interpretation, he is free to express these attitudes which
so frequently interfere with his use of test data. As he
expresses them to an accepting counselor, there is an
opportunity for them to dissipate and the client will gain
insight into his motivation. It is only as the client can
understand and accept himself that he can make actual use of test
or other data.


Recognition of elements in vocational guidance which are 
emotional rather than intellectual in nature allows the counselor 
to become more effective in helping clients. 


Summary 
Vocational counselors should utilize not only test interpretation
and vocational information but also techniques to facilitate
the client's utilization of this data. Counselors should:

1. Give the client simple statistical predictions based upon
the test data.

2. Allow the client to evaluate the prediction as it applies
to himself.

3. Remain neutral towards test data and the client's
reaction.

4. Facilitate the client's self-evaluation and subsequent
decisions by the use of therapeutic procedures.

5. Avoid persuasive methods. Test data should provide
motivation, not the counselor.


REFERENCES

1. Bixler, R. H. and Bixler, V. H. "Clinical Counseling in
Vocational Guidance." Journal of Clinical Psychology, I
(1945), 186-192.
2. Rogers, Carl R. Counseling and Psychotherapy. New York:
Houghton-Mifflin Company, 1942.


MEASUREMENT NEWS* 


A new edition of the Mental Measurements Yearbook is
under preparation by Dr. Oscar K. Buros, who has returned to
Rutgers University. The new Yearbook is scheduled to go to the
printer within three months. As a major in the Army, Dr. Buros
was Chief of the Standards Section, Office of the Director of
Military Training, A.S.F. Headquarters, until the end of 1945.

The research done on "what the soldier thinks" by the Research
Branch of the Information and Education Division of the War
Department is to be reported in a series of four volumes now being
prepared [illegible] of the Social Science Research Council.
[illegible] Leland De Vinney, Dr. Carl Hovland [illegible]
the end of 1946. It is hoped that the volumes [illegible]

A Division on Evaluation and Measurement has been formed in
the reorganization of the American Psychological Association.
The following officers have been elected: Dr. L. L. Thurstone,
chairman; Dr. Florence Goodenough, secretary; Dr. Henry E. Garrett,
Dr. Harold Gulliksen, and Lieutenant Colonel M. W. Richardson,
divisional representatives.

Dean Edmund G. Williamson has been elected chairman of the
newly formed Division of Personnel and Guidance Psychologists of
the American Psychological Association. Lieutenant J. G. Darley
has been elected both secretary and a divisional representative. The
other divisional representatives are Dr. Alvin C. Eurich, Dr. Harold
A. Edgerton, and Dr. C. L. Shartle.


A summary of studies on the development and interpretation of
tests which have been conducted at the San Bernardino Air Technical
Service Command has been prepared by the Personnel Testing Unit.
A limited number of copies of the summary is available to those
interested. Requests should be addressed to Captain Fred N.
[illegible]endrick, Chief, Civilian Personnel [illegible], San
Bernardino Air Technical Service Command, [illegible].

* Readers are invited to send notes for consideration to the Editor,
EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT, 917 Fifteenth Street,
N.W., Washington 5, D.C.


The Council of Guidance and Personnel Associations, which
includes the National Vocational Guidance Association and the American
College Personnel Association, will hold a regional meeting in
Cincinnati from March 22 to 23. Speakers will include Dr. John
Darley, U. S. Navy, Dr. Carl R. Rogers, University of Chicago, and
Mr. A. F. Hinrichs, Acting Commissioner, Bureau of Labor Statistics.
Dr. Darley will address the group on "Vocational and Educational
Postwar Testing."

The Personnel Research Board, Ohio State University, is
beginning a series of studies under the title "Executive Leadership in
a Democracy." The first study will be conducted by Dr. Carroll L.
Shartle and will involve an analysis of the executive positions and
organization structures in farm organizations in the Middle West.


Dr. Herbert S. Conrad has resigned from his position as Chief of
the Examination Methods and Statistical Analysis Unit, U. S. Civil
Service Commission, to return as Technical Consultant at the College
Entrance Examination Board, Princeton, New Jersey.


Colonel John C. Flanagan, U. S. Air Corps, was recently awarded
the Legion of Merit for exceptionally meritorious conduct in the
performance of outstanding service for the Army Air Forces. The
presentation of the medal was made January 8, 1946, by General
H. H. Arnold. The citation read in part: "Colonel Flanagan
pioneered in the establishment and development of the Army Air Forces
Aviation Psychology Program and by his ingenuity in directing
psychological research he contributed signally to the development of
effective selection and classification procedures for Army Air Forces
personnel, which has resulted in the improved utilization of manpower
and the creation of a more effective striking force."


Staff members of the Advisement and Guidance Service of the
Veterans Administration have recently returned from conducting a
series of conferences which covered the United States. In attendance
were a large proportion of Veterans Administration Vocational
Advisers and Chiefs of Advisement and Guidance and Training
[illegible], as well as some Training Officers and counselors at colleges
and universities which have contracts with the Veterans Administration
for guidance centers. Short conferences were held for
Managers of Veterans Administration Regional Offices and Chiefs of
Vocational Rehabilitation and Education. The purpose of the
conferences was to discuss policy and procedures described in the new
Manual of Advisement and Guidance, to train personnel in the use
of some of the approved techniques and to discuss problems related
to the counseling of veterans.


THE CONTRIBUTORS 


H. W. Bailey—Ph.D., University of Illinois, 1926. Acting
Director, Student Personnel Bureau, 1938-39, Director, 1939-.
[illegible] Professor of Mathematics, University of Illinois, 1943-.
Civilian Educational Advisor, A. S. T. P., STAR section, 1943-1944.
Contributor to technical journals. Member, American College
Personnel Association, Mathematical Association of America, American
Mathematical Society. Fellow, American Association for the
Advancement of Science.


Irwin August Berg—Ph.D., University of Michigan, 1942.
Personnel Counselor, Western Electric Company, 1936-1939. Clinical
Assistant and Teaching Fellow, University of Michigan, 1939-1942.
Psychologist, State Prison of Southern Michigan, summer 1942.
Clinical Counselor, University of Illinois, 1942-, Assistant Professor
of Psychology, 1944-. Author of technical articles on criminology,
personnel, and tests. Member, American Association for the
Advancement of Science, American College Personnel Association,
Association of Midwestern College Psychiatrists and Clinical
Psychologists.

Ray H. Bixler—M.A., Ohio State University, 1942. Psychologist,
[illegible] Child Guidance Center, 1943-1944. Senior
Counselor, Student Counseling Bureau, University of Minnesota,
1945-. Author of articles in the Journal of Clinical Psychology
and the Journal of Consulting Psychology. Member, American
Psychological Association.

(Mrs.) Virginia H. Bixler—M.A., Ohio State University, 1942.
ТКЫ гаңоп Director, Summit Stark, ИЩ Me е 
Gon telas, Associations, Ohio, 1943-1944. Secre муу р i 
sul sociations, , К $, Minneapolis 
1944 1948" Committee Council x paal ee Minneapolis, 1945. 
песо, Winge s inical Psychology- Member, 


` . 
in sychological Association. мү! 
D University of Iowa, 1772. e- 
ergo, ^ISSIStant, University of Iowa, 198121 a “ 
Direct’ Thiel’ College, Greenville, NIA Dean and 
енш о Рато) Mas ore Morin Pitre Staton 
ang ` easure Students, one al 

° ological о ^fellom, American Psychological Associ 


ation, Sigma Xi, Member of the Motion Picture Research Committee 
of the Payne Foundation. 


William M. Gilbert—Ph.D., University of Michigan, 1940.
Exchange Fellow, University of Hamburg, 1932. Teaching Fellow,
University of Michigan, 1935-1940. Assistant Clinician, University
of Michigan, 1937-1940. Clinical Counselor, 1940-1942, Assistant
Director, 1942-1943, Acting Director, 1943-1944, Assistant Director,
1944-, of Student Personnel Bureau, University of Illinois. Assistant
Professor of Psychology, University of Illinois, 1944-. Author of
technical articles on personnel and clinical psychology. Member,
American Psychological Association, American College Personnel
Association, American Association for the Advancement of Science,
Association of Midwestern College Psychiatrists and Clinical
Psychologists.

John O. Hershey—M.A., University of Pennsylvania, [illegible].
Counselor, Hershey Industrial School, Hershey, Pennsylvania, 19[illegible].
Author of article, "A Mobile Occupational Library," Occupations,
The Vocational Guidance Journal, XXIV (1943).


Jules D. Holzberg—M.S., College of the City of New York,
1938. Psychologist, Remedial Teaching Program, New York City
Schools, 1937-1939. Clinical Psychologist, Behavior Clinic,
Bellevue Hospital, 1940-1941. Clinical Psychologist, Psychiatric Clinic,
New York University School of Medicine, 1940-1941. Clinical
Psychologist, Mental Hygiene Clinic, Kings County and [illegible]
Hospitals, 1939-1941. Fellow, College of the City of New York,
1941-1943. Director of Boys' Work, Federation Settlement,
1942. Clinical Psychologist, New York Committee on Mental
Hygiene, 1941-1943. School Psychologist, Westchester County
Schools, 1940-1943. Psychological Examiner, U. S. Army,
[illegible]-1944. Chief of Psychology Section, Assistant Chief of
[illegible] Therapy Section, Instructor at School of Military
Neuropsychiatry, Mason General Hospital, 1944-. Member, American
Psychological Association, American Orthopsychiatric Association,
New York Academy of Sciences. New York State Certified Qualified
Psychologist and School Psychologist.


"m 
Helen Pallister—Ph.D., Columbia University, 1933. Assistant
in Psychology, Barnard College, 1929-1931. Research Associate,
Psychological Corporation, 1933-1934. Research Associate,
[illegible] University, Scotland, 1935-1938. School Psychologist,
Columbia Grammar School, New York City, 1938-1939. Instructor
in Psychology, Barnard College, 1939-1940. School Psychologist,
[illegible] School, 1941. Assistant Civil Service Examiner,
U. S. Civil Service Commission, 1942. Employee Counselor,
[illegible].
Carl R. Rogers—Ph.D., Teachers College, Columbia University,
1931. Fellow, Institute for Child Guidance, New York City, 1927-
1928. Psychologist, Child Study Department, S.P.C.C., Rochester,
New York, 1928-1930, Director, 1930-1938. Director, Rochester
Guidance Center, Rochester, N. Y., 1939. Professor of Clinical
Psychology, Ohio State University, 1940-1944. Director, Counseling
Services, United Service Organizations, New York City, 1944-1945.
Civilian Psychologist, Army Air Forces, 1944. Professor of
Psychology and Executive Secretary of the Committee of the Counseling
Center, University of Chicago, 1945-. Author of The Clinical
Treatment of the Problem Child, 1939, Counseling and Psychotherapy,
1942, and Counseling with Returned Servicemen, 1945. Author of a
number of psychological articles. Member, American Orthopsychiatric
Association, American Association for Applied Psychology
(President, 1944-1945), American Psychological Association
(President-elect, 1946-1947).


Howard C. Seymour—Ph.D., Harvard University, 1940. Assis- 
tant in Каана Harvard University, 1929-1940. Teacher and 
Guidance Counselor, Medford, Mass., 1934-1935. Superintendent 
of Boarding Schools. U. S. Indian Service, Santa Fe, N. M., 1936- 
1940. Director of Guidance, Rochester Board of Education, New 
York, 1940-1942. Co-ordinator of Guidance Services, Rochester 
Board of Education, New York, 1942-. Member, National Vocational
Guidance Association, Phi Delta Kappa, and New York Association
of Applied Psychologists.


George Spache—Ph.D., New York University, 1937. Teacher,
elementary and junior high schools, New York City, 1930-1936.
Psychologist, Friends Seminary, New York City, and Brooklyn
Sands School, Brooklyn, №. Y. Psychologist, Rye виа 
Reo, 1941-1944; Rye High School, 1943-1944. Psychologist, ап 
‘medial Teacher. Chappaqua Public Schools, Chappaqua, T i 
Present, Lecturer, New York University Extension. Author of
articles on diagnostic and remedial work in reading and spelling,
intelligence testing, visual testing, etc. Member, American
Psychological Association, New York State Association.
Donald — ph.D., University of Iowa, 1934. Re- 
h Abos сыс к: and Director of the eae иш 
it Versity of Iowa, 1934-1935. Instructor in Psychology, Univer- 
x [ 1936. Professor of Psychology 
ex ersonnel, Hamline University, 1720. 
re turer КЕШ, Уз State Hospital for ee сыа 

7, 1937. Author of research and professional articles. Т ет en 
American Psychological Asociation, American College . кеспе 
Senp ciation, Midwestern Psychological, Aion kan ie B. 
j, member, Minnesota Society "Ph iye Council, 1942-1944 and 


ER 
1945 ont, 1941-1942; member o 


Arthur E. Traxler—Ph.D., University of Chicago, 1932. High
School Principal and Superintendent, Kansas schools, 1920-1928.




Psychologist, University of Chicago High School, 1931-1936. Research
Associate, Educational Records Bureau, 1936-1938, Assistant
Director, 1938-1941, Associate Director, 1941-. Summer and part-time
teaching, University of Chicago, University of Arkansas, University
of Alabama, Columbia University, Temple University, University
of California. Author of reading tests and textbooks for
use in the teaching of reading in junior and senior high schools.
Series of publications on measurement and guidance issued by the
Educational Records Bureau. Author of a textbook on guidance.
Contributor of articles to various educational and psychological
journals. Member, American Association for the Advancement of
Science, American Educational Research Association, Phi Delta
Kappa, Kappa Delta Phi, Psychometric Society. Associate, American
Psychological Association.


(Mrs.) Margaret Houston Wilson—M.S., Temple University,
1942. Employed by the Board of Public Education, Philadelphia,
Pa., as teacher at the Gillespie Junior High School and as Counselor
at the Northeast High School. Since December, 1942, Coordinator
of the Self-Appraisal Program, Division of Educational Research,
Philadelphia Board of Public Education. Author of papers dealing
with the program. Member, National Vocational Guidance
Association.




MEASUREMENT ABSTRACTS* 


Beall, Geoffrey. “Approximate Methods in Calculating Discriminant Functions.”
Psychometrika, X (1945), 205-217.
Approximate methods of solving for discriminant functions have been tried on
three sets of data. The principal illustration is the problem of finding a weighted
sum of scores, on four psychological tests, so that men and women may be
distinguished most clearly. The work starts from the complete solution, due to R. A.
Fisher, where it is necessary to solve as many simultaneous equations, dependent on
standard deviations of the tests and their mutual correlations, as there are
tests. It is proposed, by way of numerical simplification, that a set of equations be
substituted where some one quantity replaces all the correlations. A solution is
obtained where the weights to be assigned the tests are very simply expressed in
terms of differences between the mean values of tests, the standard deviations of
the tests, and the said quantity. The difficulty remains of finding an estimate of the
arbitrary constant that will give good discrimination. If an optimal solution is made,
a figure is obtained which, in the three sets of data considered, is almost
indistinguishable from that yielded by the complete solution. The calculation of this
optimal value is, however, itself so considerable that another estimate, previously
suggested by R. W. B. Jackson, appears more profitable. This estimate is derived
simply from the variability between the total scores for each subject and the
variability of each test. Using this estimate, the discriminant functions can be rapidly
calculated; the results compare very favorably, in the case of the data considered,
with those from the complete solution. (Courtesy Psychometrika.)
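
The simplification described in this abstract, in which one quantity replaces all the correlations, can be illustrated with a short sketch. It assumes that every pair of tests shares a single common correlation `rho`; the function names, argument names, and sample figures are hypothetical and are not taken from Beall's paper. Under that assumption the weights follow in closed form from the mean differences and the standard deviations, so no simultaneous equations need be solved.

```python
def equicorrelated_weights(mean_diffs, sds, rho):
    """Discriminant weights when every pair of tests is assumed to
    correlate at the same value rho (an illustrative assumption)."""
    p = len(mean_diffs)
    # Mean differences expressed in standard-deviation units.
    z = [d / s for d, s in zip(mean_diffs, sds)]
    # Closed-form inverse of an equicorrelation matrix applied to z.
    shrink = rho / (1 + (p - 1) * rho)
    total = sum(z)
    return [(zi - shrink * total) / (s * (1 - rho)) for zi, s in zip(z, sds)]


def covariance_times(weights, sds, rho):
    """Multiply the implied covariance matrix by the weights (a check)."""
    p = len(weights)
    return [
        sum((1.0 if i == j else rho) * sds[i] * sds[j] * weights[j]
            for j in range(p))
        for i in range(p)
    ]


# Hypothetical two-test example: mean differences 2 and 3, SDs 1 and 2.
weights = equicorrelated_weights([2.0, 3.0], [1.0, 2.0], rho=0.5)
```

Multiplying the implied covariance matrix by the resulting weights should return the original mean differences, which is a convenient check that the closed form agrees with the full set of equations in this special case.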
Corsini, Raymond. “A New Method for the Administration of Individual Intelligence
Tests.” Journal of Applied Psychology, XXIX (1945), 356-359.
The author describes and evaluates the different ways in which an examiner
administers individual intelligence tests, with special attention given to the position
of the subject in relation to the examiner, to the placing of the test materials when
not in use, and to the means of scoring in order to avoid undue curiosity or
suspicion. A method of meeting these problems is presented and illustrated with a
schematic diagram, showing the location of the desk, of the box of materials on an
auxiliary table to the right of and parallel to the pull-out desk-leaf, and of the
two chairs. Vernon S. Tracht.


Dyer, Henry S. “The Usability of the Concept of ‘Prejudice.’” Psychometrika, X
(1945), 219-224.
For the purpose of determining whether the trait concept “prejudice” is usable in
the communication of meaning, a representative sample of responses of children
was submitted to a diverse group of 20 judges, who were asked to rank the
responses in accordance with the amount of prejudice indicated. The usability, or
use-value, of the concept is expressed as the extent to which the judges can agree
in their ratings, in terms of the average intercorrelation of such ratings. The null
hypothesis (an average intercorrelation of 0) is untenable. The data further suggest
that the concept of “prejudice” tends to have highest use-value in situations where
the prejudice is commonly considered of serious social concern. (Courtesy
Psychometrika.)


* Edited by Forrest A. Kingsbury.




Farnsworth, Paul R. “Attitude Scale Construction and the Method of Equal
Appearing Intervals.” Journal of Psychology, XX (1945), 245-248.
Eighty-five college students, after prejudging the items of the Thurstone-Peterson
Scale of Attitudes Toward War, were asked to indicate whether or not they placed
value E as exactly half-way between D and F. Ninety-six other college students
were given the task of locating C on a rating scale continuing from A through B,
where B represented neutrality and A extreme pacifism in half the cases and extreme
militarism in the other half; and fifty-three remaining subjects in the group,
presented with a rating line on which only B was located, were asked to locate A and
C, with A representing extreme militarism in half the cases and extreme pacifism in
the other half. Results obtained support the thesis that the method of equal
appearing intervals does not always provide a realistic frame of reference, since
many judges do not think in terms of equal units or of a straight-line continuum
with a middle neutral point. Frances Smith.


File, Quentin W. “The Measurement of Supervisory Quality in Industry.” Journal
of Applied Psychology, XXIX (1945), 323-337.
The construction of a test of supervisory quality, “How Supervise?” is described.
The items are keyed by the consensus of expert judgment. The correlation between
the modal responses of two groups of experts is .91. Validity is discussed mainly in
terms of reliable measurement of areas considered important by experts.
Reliability is estimated as .84. Top management rating of good and bad supervisors
is evaluated and rejected as a criterion of validity. S. M. Roshal.


Fiske, Donald W. and Dunlap, Jack W. “A Graphical Test for the Significance of
Differences Between Frequencies from Different Samples.” Psychometrika, X
(1945), 225-229.
For testing the significance of differences between frequencies from different
samples, an ellipse can easily be constructed on the basis of a formula developed on
the assumption that both observed samples are random samples from the same
parent population and that the best estimate of the true proportion is the
mean proportion of the two samples. The ellipse provides a very rapid method for
testing pairs of frequencies. (Courtesy Psychometrika.)
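
The abstract does not reproduce the ellipse itself. As a rough numerical counterpart, the sketch below computes the standard pooled two-proportion z statistic, which rests on the same two assumptions: both samples come from one parent population, and the true proportion is estimated from the two samples combined. Treating the "mean proportion" as the size-weighted pooled proportion is an assumption made here, and the function name and figures are hypothetical.

```python
import math


def pooled_two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference between two sample proportions,
    using one pooled estimate of the common proportion."""
    p1 = x1 / n1
    p2 = x2 / n2
    # Pooled (size-weighted mean) proportion under the null hypothesis.
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Hypothetical frequencies: 30 of 100 in one sample, 20 of 100 in the other.
z = pooled_two_proportion_z(30, 100, 20, 100)
```

An absolute value of z beyond about 1.96 would ordinarily be read as significant at the 5 per cent level in a two-sided test.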


Garrett, Henry E. “Comparison of Negro and White Recruits on the Army Tests
Given in 1917-1918.” American Journal of Psychology, LVIII (1945), 480-495.
Army test data as presented and interpreted by M. F. Ashley Montagu (American
Journal of Psychology, LVIII, 161-188) are commented upon with reference to the
method of comparing Alpha and Beta medians for Negro and white soldiers in
World War I. It is contended that comparison in terms of a combined scale based
on stratified samples of Negro and white soldiers gives a more accurate and impartial
appraisal of racial differences than that offered by Montagu, and that Montagu's
thesis that the racial differences exhibited are explained by socio-economic factors is
not borne out by the test data. Frances Smith.


Griffiths, George R. “The Relationship Between Scholastic Achievement and
Personality Adjustment of Men College Students.” Journal of Applied Psychology,
XXIX (1945), 360-367.
The problem undertaken is to determine whether or not there is a significant
relationship between personality adjustment and academic achievement. It is based
upon the results of the Bell Adjustment Inventory administered to Ohio University
freshmen. No statistically significant relationships appear. Results, however,
suggest some degree of positive correlation between scholastic achievement and
personality. Leroy S. Burwen.
Gurvitz, Milton S. “An Alternate Short Form of the Wechsler-Bellevue Test.”
American Journal of Orthopsychiatry, XV (1945), 727-732.
A statistical survey was made at the United States Penitentiary at Lewisburg,
Pa., under the direction of Dr. Robert M. Lindner, to determine which subtests of
the Wechsler-Bellevue Scale combined high predictive value with simplicity and
minimum time requirements. The Digit Repeating Test and the Picture Arrangement
Test were chosen as giving weighted scores lying nearest the mean of the
subtest weighted scores. This short form was found to be more discriminating than
the Rabin Short Form, particularly in the IQ range from 40-70, and showed a
correlation of .90 with the full scale in 523 cases from a heterogeneous population.
Frances Smith.


Hall, W. E. and Robinson, F. P. “An Analytical Approach to the Study of Reading
Skills.” Journal of Educational Psychology, XXXVI (1945), 429-442.
This study is an expansion of previous factor analyses for determining
independent reading skills and the tests which best describe them. Several new tests
are added which describe other aspects of reading and make the determination of
factors more reliable. One hundred students of freshman English at Eastern
Washington College of Education were given Robinson and Hall's nonfiction tests in
geology, history, and art; Pressey’s Dictionary Test; Robinson's test on table reading;
and some specially constructed tests in reading charts, diagrams, and maps. Six
factors were isolated in the types of reading accuracy situations studied, one
implication being that prose and nonprose materials require different reading skills.
Vernon S. Tracht.


Horn, Charles A. and Smith, Leo F. “The Horn Art Aptitude Inventory.” Journal
of Applied Psychology, XXIX (1945), 350-355.
The test was designed for the assessment of quality of line, appreciation of
proportion, compositional sense, scope of interests, fertility of imagination, and the
ability to depict ideas pictorially. The three tasks presented in this test are scored
in terms of the “goodness” of the productions. While the scoring is subjective,
correlation coefficients, ranging from .79 to .86, are reported for the relationship
between the scores assigned by a layman and those assigned by a member of the
aculty. Validation against faculty ratings of success yields coefficients of ER 

66 for samples of 52 and 36 respectively. The test was found to be a better predictor
of success in art school than the A.C.E. Psychological Examination. S. M. Roshal.


Hoyt, Cyril J. “Testing Linear Hypotheses Illustrated by a Simple Example in
Correlation.” Psychometrika, X (1945), 199-204.
The development of a criterion suitable for testing the significance of a correlation
or regression coefficient is used as an illustration of the manner in which a
research problem is bound to the selection of the particular data to be collected
and a fitting type of statistical analysis of the latter. Recasting the original inquiry
as a problem of “testing linear hypotheses” is what holds these two aspects of an
investigation together. This presentation is offered as a plan which might be useful
for some research workers in developing criteria for testing their particular
hypotheses. (Courtesy Psychometrika.)


Humm, D. G. “Sidelights on the Use of Intelligence Tests.” Journal of Consulting
Psychology, IX (1945), 228-233.
an emphasising thas ostia: t as a whole pem poene с 
dits ao SA piam, the Кар of disposition and emotional ETE 
at m; elligence to reveal any, ld be given a minimum 0. 


Я bject shou! 5 d 
2 in]? t affect mental manipulations. hen et ошо е -and during thelr 


I&ence tests—preferably. bl e difficulties. These 
ministrat; D › for possible ey i 
iue Moe poit uet pe pee for statistical implications, will make such 
tests more useful and meaningful. Vernon S. Tracht.
Jarrett, R. F. “On the Permissible Coarseness of Grouping.” Journal of Educational
Psychology, XXXVI (1945), 385-395.
The author points out that while the effects of grouping errors upon the mean are
understood to be “unsystematic” (i.e., they average to zero), it is not so generally
recognized that the variance of a sampling distribution of means is larger when
computed from grouped rather than ungrouped data. He outlines in simple terms the
construction of a statistical method for determining in advance the number of class
intervals to use in satisfying a criterion. Thus one can decide whether his errors of
grouping are too high from the standpoint of the “level of confidence” one will
tolerate in the light of the data at hand. Vernon S. Tracht.


Levi, J., Oppenheim, S., and Wechsler, D. “Clinical Use of the Mental Deterioration
Index of the Bellevue-Wechsler Scale.” Journal of Abnormal and Social
Psychology, XL (1945), 405-407.
By employing the measure of difference in scores between two groups of
Wechsler-Bellevue subtests, those which hold up with age and those which do not
hold up with age, an index of intellectual deterioration is obtained which may be
indicative of abnormal impairment of mental functioning. The index is given by the
formula $\dfrac{\text{Hold} - \text{Don't hold}}{\text{Hold}}$, and a loss in excess
of 10% is suggested as an indicator of possible impairment. Cases cited show that
this index is useful in detecting, as well as in confirming, organic conditions.
Emphasis is placed on need for further experimentation, including use of control
groups and statistical refinement. Frances Smith.
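
A minimal sketch of the index arithmetic follows. It assumes the index is the relative loss, (Hold − Don't hold) / Hold, the form usually attributed to Wechsler; the denominator is not legible in this copy. The function names, the sample scores, and the treatment of 10% as a strict cut-off are illustrative assumptions only.

```python
def deterioration_index(hold, dont_hold):
    """Relative loss of the 'don't hold' subtest sum against the 'hold' sum.

    Assumes the form (Hold - Don't hold) / Hold.
    """
    if hold <= 0:
        raise ValueError("the 'hold' score must be positive")
    return (hold - dont_hold) / hold


def suggests_impairment(hold, dont_hold, threshold=0.10):
    """True when the loss exceeds the suggested 10% level."""
    return deterioration_index(hold, dont_hold) > threshold


# Hypothetical weighted-score sums for one examinee.
index = deterioration_index(50, 40)   # a 20% loss
flagged = suggests_impairment(50, 40)
```

A loss above the threshold is, as the abstract stresses, only an indicator of possible impairment and not a diagnosis in itself.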


Malamud, R. F. and Malamud, D. I. “The Validity of the Amplified Multiple
Choice Rorschach as a Screening Device.” Journal of Consulting Psychology,
IX (1945), 224-227.
This amplified test, devised by Harrower-Erickson to improve the validity of
her original version through modifications in form and scoring procedure, was given
individually to 100 normals and 100 abnormals to determine whether it discriminated
between these groups. The results were negative, indicating that in its present form
it still is not a good screening device, and cannot be depended upon as a
differentiating instrument. The authors suggest a number of improvements which will
increase its validity and make it self-administering. Vernon S. Tracht.


Myklebust, H. R. and Burchard, E. M. L. “A Study of the Effects of Congenital
and Adventitious Deafness on the Intelligence, Personality, and Social Maturity
of School Children.” Journal of Educational Psychology, XXXVI (1945).
This study sought to determine whether significant measurable differences in
intelligence, social maturity, and personality existed between children born deaf and
those whose deafness was acquired after speech had developed. Comparisons were
made from results on the Grace Arthur Performance Scale, administered to the entire
group of 100 males and 89 females (68 of whom were adventitiously and 121
congenitally deaf), on the Haggerty-Olson-Wickman Behavior Rating Schedules in
87 cases, and on the Vineland Social Maturity Scale in 104 cases. While no
statistically reliable differences between the groups were found in the three variables,
both groups were retarded in social maturity and evidenced maladjustment
when compared with norms for the nonhandicapped. Vernon S. Tracht.


Postman, Leo and Zimmerman, Charlotte. “Intensity of Attitude as a Determinant
of Decision Time.” American Journal of Psychology, LVIII (1945), 510-518.
Twenty-eight subjects were administered a Thurstone-type scale for measuring
attitude toward the Catholic Church. The time required to make a Yes or No
response to 20 statements was taken, after which the subjects indicated the intensity
of their acceptance or rejection on an 11-point scale. The results verified previous
evidence that decision-time becomes longer as the border of a range of equivalent
stimuli is approached, thus showing it to be a systematic function of intensity of
attitude. The experiment's practical application in regard to polls of public opinion
is pointed out. Vernon S. Tracht.




Simpson, R 

.G Зд Di е 
Жы Bui нр; List of Spelling Wi 
To sl он Psychology, E AC ed Freshmen." 
End жен abes iul need of a satisfactory list of spelling ' 
and analyzed y need E quens eia Ml E eee ine ords for. fis. Ш 
Students ас Carnegi 1 most frequently misspelled words in th entally compiled 
original number. нЕ e nstitute of Technology over a 5-yea e written work of 
crucialness and w /5 were finally selected on the basis of high See From this 
word being left уегеппсорор ай in an outline-word test, the peed spe of, — 
tion was found t ank for the student to fill in. This outline form of Spo in each 
E js suggesting оли favorably with the dictation method m ot wora BE 
aluable sile: it the outline method, i Ы =.90, Р.Е. =. 

le silent spelling test. Vernon 5. d, 1E properly developed, can be made a 


Stalnaker, John M. “Personnel Placement in the Armed Forces.” Journal of Applied
Psychology, XXIX (1945), 338-345.
fhe armi e discusses the advantages of effici i 
med forces, and some of the methods and cieni menos dig Stan s 
" a urtoen. 


Sward, Keith. “Age and Mental Ability in Superior Men.” American Journal of
Psychology, LVIII (1945), 443-479.
tests, w idual mental test consisting of eigh i 
a lua x 5 ght subtests, tv 
e Ч уоона to 45 university professors age Tus en coo uet 
y mean academic men aged 25-35. Test scores of the two 
young opor C R calculations, show in six tests a significant guys em 
median A ave the Synonyms-Antonyms 
erences the young. Individual differences are 
an нег gy losses” аге interpre i 
cate char of the particular test emp 1 
Processes? at an within the upper ranges of ability impairment of "higher mental 
is by no means an invariable concomitant of age. 
— 


Thurstone, L. L. “The Effects of Selection in Factor Analysis.” Psychometrika, X
(1945), 165-193.
Factorial results are affected by selection of subjects and by selection of tests.
The addition of one or more tests which are linear combinations of tests already in
a battery causes the addition of one or more incidental factors. If the given test
battery reveals a simple structure, the addition of tests which are linear combinations
of the given tests leaves the structure unaffected unless the number of incidental
factors is so large that the common factors become indeterminate.
(Courtesy Psychometrika.)


hool Intelligence Quotients 

cational Psychology, XXXVI (1945), 443-446. 

by educators an counselors for a better under- 
"tude test scores, the author 


Web, 

Sty: E. « 
est e Equating High-Sc 
зал Оп res." Journal of Edu 
do, ding © o the need expressed 

vibes а e meaning of IQ's in terms 
Tay; grad, е ПОЧ of equating both. One thousand University of Л s 
P ts of protes Of Minneapolis High Schools, were given the Otis Quick-Scoring 
саша ол ental Ability, and the Psyc ‘American Council on 
ted p, үр сы Scores, мего transposed into standard scores and these in turn 

ts of Pathe Standard Deviation Linear Technique. chart illustrates the linear 
s and ACE scores and the manner of reading one from the other. 


racht. 
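
The abstract above names a "Standard Deviation Linear Technique" for equating IQ's with ACE scores but gives no formula. The sketch below shows one common reading of such a technique: a score is converted to a standard score on its own distribution and then carried over to the score with the same standard score on the other distribution. The function name and the sample figures are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def linear_equate(score, from_scores, to_scores):
    """Map a score onto another scale by matching standard scores.

    from_scores and to_scores are the two tests' scores for one group.
    """
    # Standard score of the given value on its own distribution.
    z = (score - mean(from_scores)) / pstdev(from_scores)
    # Value on the other scale that has the same standard score.
    return mean(to_scores) + z * pstdev(to_scores)


# Hypothetical IQ's and aptitude-test scores for the same three students.
iqs = [90, 100, 110]
aptitude = [30, 50, 70]
equivalent = linear_equate(110, iqs, aptitude)
```

Because the mapping is linear, plotting equated pairs gives the kind of straight-line chart from which one score can be read off the other.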
Wellman, Beth L. “IQ Changes of Preschool and Nonpreschool Groups During the
Preschool Years: A Summary of the Literature.” Journal of Psychology,
XX (1945).
The findings from about 50 references, on distribution of group IQ changes
during the preschool years for preschool children and for nonpreschool children, are




collated and tabulated to form a rebuttal of criticisms that only Iowa investigators
obtain increases in these groups. Twenty-two preschool groups (1,537 children),
tested on three forms of the Binet, made a mean group change of plus 5.4 points;
Fourteen nonpreschool groups (597 children) made a mean group change of Pool 
ll point. On the Merrill-Palmer Scale the mean group change for RIES sk 
children was plus 13.0 points; for the nonpreschool children, plus 6.2 points. Gains
are also noted on the Gesell, Minnesota, and California Preschool Schedules.
Granville C. Fisher.


Wilson, Guy M. and Staff. “Adapting the Minnesota Rate of Manipulation Test to
Factory Use.” Journal of Applied Psychology, XXIX (1945), 346-349.
In order to save time in the use of the Minnesota Rate of Manipulation Test,
the author suggests the use of the low score of three trials instead of the sum of the
scores of four trials. A correlation of .968 between these two scoring methods in a
sample of 63 subjects is reported. S. M. Roshal.


Zipf, George Kingsley. “The Meaning-Frequency Relationship of Words.” Journal
of General Psychology, XXXIII (1945), 251-256.
The author states that “different meanings of a word will tend to be equal to
the square root of its relative frequency.” This conclusion was reached after a
quantitative investigation of E. L. Thorndike’s list of 20,000 most frequent words on the
one hand, and the actual number of the separately numbered meanings of those
words as given by the Thorndike-Century Senior Dictionary on the other. Dr. Zipf
states further, “There is no reason to suppose that in making the dictionary, Dr.
Thorndike selected for each word a number of different meanings that was
proportionate to a power of the word's frequency.” This article, however, may
indicate that Thorndike by skillful empirical methods achieved results that may be
analyzed mathematically. Gustav Dunkelberger.
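
The square-root relationship quoted in this abstract can be put in a small numerical sketch. It assumes the number of meanings is proportional to the square root of relative frequency; the constant `scale` and both function names are hypothetical, since the abstract gives no units for either quantity.

```python
import math


def predicted_meanings(relative_frequency, scale=1.0):
    """Number of meanings predicted by the square-root relationship.

    scale is an assumed proportionality constant, not a published value.
    """
    return scale * math.sqrt(relative_frequency)


def meanings_ratio(freq_a, freq_b):
    """How many times more meanings word A is predicted to have than word B."""
    return math.sqrt(freq_a / freq_b)


# A word used four times as often is predicted to have twice the meanings.
ratio = meanings_ratio(400, 100)
```

The ratio form is the more convenient one in practice, because it does not depend on the unknown constant.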




DIAGNOSIS IN COUNSELING AND PSYCHOTHERAPY 


EDWARD S. BORDIN 


University of Minnesota 


IN the last ten years there has been considerable ferment in
the thinking about counseling and psychotherapy with normal
individuals. This period has been marked by great strides
toward converting an unverbalized art to a carefully delineated
practice based upon the results of empirical studies. Books and
articles have been published which dealt with concrete descriptions
of practices and which presented theories of treatment.

Within the groups turning toward more definitive discussions
and descriptions of treatment, two somewhat divergent
points of view have been discernible. Rogers and his students
have been the primary source for the presentation of a nondirective
theory of counseling and therapeutic procedures, and
Williamson, Darley, and more recently Thorne have been the
most vocal exponents of conceptions which have been labeled
directive by the first group. Williamson (11) made a pioneer
contribution by presenting a rich compilation of the kinds of
individuals with whom the student personnel worker will deal
and the procedures he might use in attempting to aid them.
Rogers (8) has contributed an integrated description of a treatment
process. Further, he has distinguished his treatment as
nondirective and has questioned the validity of directive
methods used by personnel workers and others concerned with
individualized treatment. Thorne, while conceding the
contribution of nondirective techniques, has denied that this is
the only method which has value and has described situations
in which directive methods are more effective.

Thus the psychological practitioner is faced with a choice of
treatments. He is faced with a choice which will be difficult to




make unless he is already prejudiced in favor of one or the 
other. He is faced with a difficult choice whether he is unde- 
cided as to the relative validity of the two or accepts Thorne’s 
thesis that they are not incompatible. In the latter instance 
he still must decide the proper time to use each one. 

Before this decision can be made in an adequate and final 
manner, there is still one more element to be added, namely, 
diagnosis. There can be no completely definitive demonstra- 
tion of the differential validity of treatment without knowledge 
of what we are treating. True, one could say that we are treat- 
ing human dissatisfaction and unhappiness, but could the great 
strides in medical therapy have been made if medical scientists 
and practitioners had been willing to stop at the level of diagnosing
patients as such? Guthrie makes the same point when
he says:


It (psychotherapy) must be restricted to those efforts at 
treatment which are consciously (in so many words) based on 2 
knowledge of the ways of the mind, those treatments in which we 
are aware of the psychological explanation of the distress and the 
poires of adaptive habits we are establishing as a cure, 


We must be able to distinguish the behavioral characteristics
which will accompany one type (source) of dissatisfaction
from those that will accompany another type of dissatisfaction.
From classifications based upon specific sets of characteristics
we must be able to predict other characteristics which will be
found either at the same time or with the progression of time,
as, for example, by knowing the species of a bird we are able to
predict its mating behavior, its migratory habits, etc. In this
way we can set the stage for the most important prediction
from the practitioner’s standpoint, that is, the prediction of the
effect of one treatment as compared to another (or as compared
to no treatment).
It is the purpose of this paper to explore the diagnostic concepts
which have been used and to attempt to contribute toward
the development of a series of diagnostic constructs which
will make possible definitive studies of treatment hypotheses.
Since most counseling and psychotherapy is being directed toward
the psychological problems found within the normal range
of individuals, and due to limitations of the writer’s own experience,
the constructs developed will have primary reference
to problems as they appear in counseling and guidance services
in colleges and universities and other educational institutions.
Both Williamson and Rogers have tended to address themselves
to this type of setting.


Desired Characteristics of Diagnostic Constructs 


It has been suggested that substantial progress in the validation
of psychotherapeutic treatment processes cannot be
made without the postulation and validation of constructs or
“causes” of psychological problems. Let us consider the characteristics
by which a potentially valuable set of diagnostic
constructs can be recognized.

1. One of the most important characteristics of such a construct
is that it enables the clinician to understand more clearly
the significance of the individual's behavior. For example, this
kind of understanding would appear to play an important role
in the therapist's ability to respond adequately to feelings expressed
by the client in a nondirective treatment process.
Diagnostic constructs should sensitize the clinician to respond
to significant characteristics of the client's behavior that might
otherwise have been overlooked. The degree of understanding
fostered by the constructs will be reflected by the comprehensiveness
of the predictions which can be made about the individual
by assigning him to a class. This is the operational significance
of understanding. We perceive a distinctive and
familiar pattern which is part of a larger pattern the characteristics
of which are then predictable from our perception of the
smaller pattern. This is the secret of the medical diagnostician's
success, namely, that from a few symptoms he is able
to predict the other symptoms. In fact, he checks his diagnosis
by seeing whether the additional symptoms do conform to
expectation.
2. The more a set of diagnostic constructs vary independently,
the closer they are assumed to be to the status of “true”
causes and the farther from the status of surface symptoms.
That is, the more independent a set of constructs, the more
sharply focused the prediction yielded. If, for example, fever,
coughing and sneezing, blood counts, skin condition, etc., were
used as basic constructs in the medical field, it would soon be
found that they do not vary independently, that they form
patterns, and that the predictions provided by any one construct
are very limited. The medical practitioner would explain
to us that these characteristics do not predict much because
they are symptoms, not causes. To state it another way,
a set of constructs based upon the patterns of these limited
classifications will provide a basis for a more comprehensive
set of predictions. From this point of view the most desirable
statistical characteristic of a set of diagnostic classifications is
that they not only vary independently but are also mutually
exclusive. However, we could no more expect this than we
should expect that there will be no individuals who have
measles and whooping cough or any other combination of diseases
at the same time. By setting a criterion of statistical
independence we ask only that various combinations of categories
do not occur more frequently than would be expected
by chance. We can become most suspicious of the comprehensiveness
of a set of categories when we find greater than
chance incidence of combinations of three or more of them.
3. From the theoretical as well as from the applied point of
view, but particularly from the latter, the most vital characteristic
of a set of diagnostic classifications is that they form the
basis for the choice of treatment. This means that there should
be some understandable and predictable relationship between
the characteristics which define the construct and the success
of treatment processes. From the therapist's point of view
diagnosis will be of little value unless it points to treatment.
Part of the definition of a diagnostic construct should include
some statement as to how the condition can be modified, and
its validity will depend in good part on whether this prediction
can be verified.


tme 


Present Status of Diagnosis


In the area of normal psychological problems the concept of
diagnosis presented above has been used rarely. Rogers (8) raises
the question of diagnosis, but he does so as though there was
only one possible type of interview therapy. He confines his
discussion to listing two sets of criteria, one for the use of treatment
by manipulation of the environment and the other for
determining whether the individual can take interview therapy.
For a long time there has been current among counselors,
working in the educational and vocational guidance setting,
terminology for describing their clients’ problems which centered
around the difficulties about which they complained.
Williamson and Darley (12) and later Williamson (11) developed
these ideas into an attempt at a systematic set of diagnostic
categories. Only a summary will be presented with no
attempt to reproduce Williamson’s extensive description of the
five suggested categories:
Personality Problems.—Included in this grouping are difficulties
in adjusting in social groups, speech difficulties, family
conflicts, and infractions of discipline.

Educational Problems.—These include unwise choice of
courses of study and curricula, differential scholastic achievement,
insufficient general scholastic aptitude, ineffective study
habits, reading disabilities, insufficient scholastic motivation,
overachievement, underachievement, adjustment of superior
students.

Vocational Problems.—Descriptive subdivisions of this category
are uncertain occupational choice, no vocational choice,
discrepancy between interests and aptitudes, unwise vocational
choice.

Financial Problems.—These include difficulties arising from
the need for self-support in school and college and the correlated
questions of student placement.

Health Problems.—This category refers to the individual’s
adjustment to his health or physical disabilities, or both.

Examination of this diagnostic system indicates that primarily
it represents an attempt to describe the individual in
terms of the demands of his environment. [...] the aspects of
his social environment with which he appears to be unable to
cope to his satisfaction [...] (which assumes eventual
dissatisfaction for the individual). This type of description
might be termed a sociological description of the individual to
distinguish it from a psychological description of the individual
which starts at the individual describing the organization of his
behavioral characteristics and predicting what his reactions will
be to his social environment.

Let us consider the adequacy of these sociologically rooted
diagnostic classifications by applying the criteria suggested
above.

First, do they point the way to treatment? Since William- 
son does not attempt a clearly structured description of treat- 
ment processes, the answer must be inferred from his discus- 
sions of specific procedures in specific situations. Such analysis 
leads us to the conclusion that treatment is not indicated by the 
problem classification but by other factors. Williamson does
state that “the effective counselor is one who adapts his techniques
of advising to the personality of the student” (11: p.
138). Some individuals who present vocational problems or
educational problems or financial or personality problems might
be helped by giving them information. Yet others who present
difficulties in the same areas must be dealt with in terms of their
feelings. Thus, the assignment of the individual's difficulties
to one of this set of classes of difficulties does not provide a basis
for prediction of the relative success of different treatments.

Second, to what degree do these classifications vary independently?
To answer this question there are data available
on some two thousand cases who came to the Student Counseling
Bureau at the University of Minnesota between 1932 and
1935.¹ These cases were classified according to the above diagnostic
system. The resulting distributions showed a considerable
degree of patterning in the occurrence of the problem categories.
For example, there was only one category, vocational problems,
which exhibited any appreciable occurrence by itself. Approximately
twenty-three per cent of the total number of individuals
were classifiable as having only vocational problems. The next
highest occurrence of a single problem was only 1.6 per cent for
educational problems. Similarly, the distributions of combinations
of problems were far removed from what would be expected
by chance. The most frequent combination of two problems was
vocational-educational, which was
presented by 27.7 per cent of the total population as compared
to the next highest frequency of 5.8 per cent for the
combination of vocational and personality problems. Similar
non-chance distributions are found in the occurrence of combinations
of three and four problems. Further, there were more
individuals who presented all five problems (1.1 per cent) than
there were individuals who presented single problems of either
financial (0.2 per cent), personality (0.2 per cent), or health
(0.0 per cent). These results would appear to suggest strongly
that there is a deeper level of analysis than is represented by
these categories. It suggests that these categories would appear
in the relation of surface symptoms to a set of categories representing
a deeper level of analysis.

¹ Taken from an unpublished report by E. G. Williamson and E. S. Bordin.

What of the third criterion, the amount of understanding
conveyed by the classification, that is, its predictive value?
The same study, cited above, produces data on this question.
It was found that various characteristics of the individuals were
not predicted so much by the single classifications, except for
vocational, as by various combinations. In other words, again
it looked as though there was some more basic classification
which might be somewhat reflected by the present ones.


A Suggested Set of Diagnostic Constructs


Because analysis of the type presented above indicated that
the present system of diagnostic classification far from fulfilled
the desired characteristics of diagnostic categories, the writer
felt it necessary to search for some more adequate system. Williamson's
treatment of these categories seems to reflect a recognition
of their incompleteness and offered one useful source of
inspiration. For each category and the subdivisions of it he
gives considerable time to a discussion of the causes of the problem.
Here much of his analysis is at the psychological as well
as the sociological level. In other words, he considers the
organization of the individual's life history which leads him to
his present status and its significance for other forms of behavior.



This source and others were consulted, but the main basis 
for the set of diagnostic constructs which will be presented 
below is the actual observation of clients over a period of about 
six months. As each client talked about his difficulties in mak- 
ing a vocational decision, or about the fact that he felt that he 
needed help in working out a method for financing his educa- 
tion, etc., the writer asked himself, and attempted to answer, 
the questions, “Why cannot this individual work this thing out 
himself? What is stopping him from being able to find a satis- 
factory solution? How is he different from his fellow students 
who appear to be facing the same problems and working them 
out successfully for themselves?” Certain types of answers 
began to appear. They were answers which suggested ways 
in which the client could be helped. They were answers that 
gave the counselor the feeling that he could predict how the 
client would react to various possible verbal stimulations. 
They were answers which seemed to have antecedents in other 
psychological observation and experimentation. 

Having considered the method of search, we are ready to
look at the resulting diagnostic constructs.

Dependence.—This concept is common currency in child
and adolescent psychology where it is usually discussed under
the rubric “psychological weaning.” The client comes to the
counselor for help because he has not learned to solve his own
problems. The client is used to playing a passive role. He has
been dependent upon his parents or parent-surrogates to solve
his problems for him. His progress beyond the infant stage is
reflected by the fact that he has learned how to ask for help
more explicitly and is more discriminating as to where he directs
his requests for aid. Usually he has come to the counselor
because someone has taken the responsibility to suggest it. The
counselor will find that this type of client resists accepting
responsibility. He will be anxious to continue his contact with
the counselor. If given the opportunity, he will wear a path
to the counselor's door, coming in for help with every decision
that faces him: how to plan his time, how to find a part-time
job, whether to take Psychology this quarter or wait until next.
The unwary counselor will feel that he has established a good


relationship (rapport) with this client [...] fostering the
further [...] (from either the social or individual [...]).
Treatment of individuals presenting this kind of problem [...]
to include aid in insight and acceptance [...] do feel inadequate
to cope actively [...] with [...] everyday problems and aid in
obtaining [...] that will enable them to work out their [...]
their problems for them will [...] state which will bring them
back to the counselor or someone else as each new problem
presents itself. In the early stages, but after the client has
gained insight into his dependent feelings, it may be necessary
for the counselor to partially guide the client as he makes his
first tentative steps toward independent action, much as, at
earlier stages, we keep youngsters from harm as they learn to
cross streets by themselves.


Lack of Information.—Many individuals face situations for
which their experience has not prepared them. The individuals
who would fall in the lack-of-information category are individuals
who are used to accepting the responsibility for making
their own decisions, but who face a decision involving information
or skills out of the realm of their experience. In a
university that draws students from small rural schools there
are many such individuals, bewildered by the organizational
details of a complex educational instrument or by social customs
[...] to their experience. These individuals lack the
opportunities to compare themselves with representative groups
necessary to accurate judgments [...] abilities, [...]
to achieve social goals. [...] ignorance, he [...] also
recognize that ignorance may also arise as a function [...]
restriction in opportunity to learn. The types of lack of
information which have been mentioned can arise from all


types of environmental restrictions in experience which make 
this ignorance plausible. He needs to beware of excessive ignor- 
ance or unusual combinations of ignorance which is insufficient 
to account for the perplexity displayed. Yet, if he is working 
in a situation where large proportions of a student body are 
aware of the counseling service, a sizable proportion of the indi- 
viduals who come to him will be classifiable as lacking informa- 
tion. The treatment of such individuals would appear to be 
quite direct. They should be given information, referred to 
books or other individuals, and so on. Where the individual 
is seeking to avoid responsibility, care must be exercised to avoid
giving him the information in such a manner as to foster his 
potential dependence. 

Self-conflict.—The fact that there appear to be sharply
differentiated organizations of individuals’ behaviors toward
themselves as stimulus objects has been receiving renewed and
extended attention in the recent psychological literature. This
factor has been discussed under the topic of ego, by Allport
(1); ego involvement, by Edwards (3, 4), and Wallin (10);
role and self, by Guthrie (5); and self-concept, by Raimy (7)
and Bordin (2). From clinical observation it appears that
many of the obstacles in the individual's ability to cope with
his problems arise from the conflict between the response functions
associated with two or more of his self-concepts or between
a self-concept and some other stimulus function. Guthrie takes
a similar position when he cites the “conflict between role and
actuality” as a source of students’ breakdowns. He cites as an
example:


a docile girl who received good marks throughout grade and high
school. Modern schools grade their pupils according to effort
and docility and not according to actual achievement. . . . When
she reaches the university there is keener competition and more
objective grading. As a result she manages to receive only
average grades in spite of increased effort. She cannot reconcile
herself to average grades or face her family where her record has
always been a matter of pride and comment. She begins to lose
[...] despondent, to find herself unable to study [...]

The description is a familiar one. It has been duplicated
in the experience of most college clinicians. In addition to such


familiar instances of conflict between a self-concept and the 
ability to behave in a manner consistent with that self, there 
are instances where two self-concepts come into conflict. Take, 
for example, the instance of the son of a doctor who has devel- 
oped considerable identification with his father. Through the 
years they have shared many activities, hunting, building 
motors in a shop, attending athletic events. But the activities 
shared were not necessarily those intimately related to the prac- 
tice of medicine. The development of the son’s experience is 
such that one of his dominant self-concepts is that of a forester. 
At the same time, the son’s close relationship with his father 
and his father’s evident desire for him to become an M.D. 
makes for a competing picture of himself, but one which is not 
as closely allied as forestry to the majority of his behavior
patterns. The basis for conflicting motives is largely unverbalized.
ized. The student can only say that he cannot seem to make 
up his mind as to what to do. 
The nondirective treatment process described by Rogers 
(8) appears to apply most completely and most directly to this
type of psychological problem. It can be assumed that individuals
presenting problems of self-conflict must be aided to
recognize and accept their conflicting feelings before they will
be able to arrive at the positive decisions involved in resolving
the conflict, 
Choice Anxiety.²—In 1941-42 when these categories were
being formulated, large numbers of students in universities
were grappling with the problem of their relationship
to the national emergency. This was the period of the
Enlisted Reserve Corps, the Navy V3 and V-12 programs, and
the deferment of students in certain scientific and technical
fields. The nature of the psychological problem represented by
the students who came to the writer with their quandary can be
represented by an analogy to the experimental neurosis experiments
reported by Maier (6). In these experiments rats were
trained to jump from a platform toward the correct one of two
doorways. If the correct doorway was discriminated, it swung
open as the animal hit it and a reward of food followed. If the
wrong discrimination was made, the door did not swing open,
the animal bumped its nose and fell into a net below, presumably
a very dissatisfying experience. After the animals had
learned to make the correct discrimination, experimental neurosis
was induced by the punishment of either choice. Maier
noted that not all of the animals developed neurotic behavior.
Those that may be said to have accepted their plight, as evidenced
by abortive jumping, did not develop neurotic symptoms.
On the other hand, those animals that continued to
“expect” to find a rewarding choice were the ones that did
develop the symptoms. The analogy to the plight of the students
seeking help was striking. These individuals were faced
with alternatives, all of which were unpleasant in that all would
involve a disruption of their life plans. The student talking
to the counselor was fully informed on all of the alternatives
open to him. He appeared to be coming to the counselor in
the hope that he would be able to find some other alternative
that would represent a way out without unpleasant consequences.
These students were under considerable tension, indecisive,
and tending toward physical exhaustion. The state
could be characterized as approaching psychasthenia. It could
be said to differ from psychasthenia in that it depends more on
sudden disorganizing crises of a type that can lend themselves
to procrastination and are not as clearly a part of a long-term
behavior pattern of the individual. Perhaps one of the essential
differences would be that of degree and amenability to
therapy.

² The writer is indebted to Mr. Harold Pepinsky for the suggestion of the name
for this category.

It can be expected that problems of this type will increase
in incidence during any period of social upheaval and rapid
change. The writer has since encountered the same psychological
state in returning veterans. One example is that of an
ex-service man in his middle twenties, married and trying to make
up his mind whether he should go to college or accept immediate
employment. If he goes to college he realizes his fondest
dreams, tries out his new-found confidence in himself, and
makes it more possible to set his occupational aspirations at a
higher level. But also, if he goes to college, his wife has to work


to contribute to their support. This postpones having children, 
raises uncertainties about his wife’s satisfaction, because she 
too would like to go to college, and postpones his own economic 
independence. On the other hand, accepting immediate em- 
ployment, even with some opportunity for on-the-job training, 
means resigning himself to a lower level of aspiration and giving 
up the chance for a college education. Neither alternative is 
free of unpleasant results. 

That this psychological problem is not confined to situations 
arising out of rapid social change can be illustrated by still 
another problem of choice anxiety. This is a case of a woman 
in her early thirties whose husband decided that marriage is too
confining for his catholic sexual tastes. She comes to the counselor,
presumably to obtain help in deciding what occupation
she should train for in anticipation of the need to be independent.
However, she appears unable to decide, while expressing
concern about the need for decision and exhibiting symptoms
of continuous tension. It is evident that her alternatives are
both punishing, one to submit to the insecurities of life with her
husband or the other to submit to the insecurities of life without
a husband.

[...] appears to be indicated for individuals of this type [...]
to enable them to [...] that they are “in for it.” It is here
[...] individual has accepted the fact that [...] there is no
escape without [...] symptoms will disappear [...] able to make
a decision. It is further [...] when it is given to them [...]
about themselves in relation to [...]. In the cases of the woman
cited above and the returning serviceman, the resolution of
their problems seemed to follow that course.

No Problem.—[...] the clinician [...] that, if he works in a
[...] publicized and widely [...] agency to which individuals
[...] proportion of the individuals will


not present definitely classifiable problems. For the most part 
they will be individuals who come to the counselor in the same 
spirit in which we might visit our doctor once a year for a 
physical checkup. In other words, they are playing safe. In 
an agency like the Student Counseling Bureau of the University 
of Minnesota, which is widely known throughout the state and 
favorably recommended by high-school educators, it is to be 
expected that many students will visit it as a safety measure 
at the time of entrance to the university which means a time 
of educational and vocational decision. These students are
likely to say to the counselor, “I know what I want to do, but
I wanted to see what you would say.” True, this statement
could also be a reflection of a defensive reaction against a feeling
of self-conflict or dependence, and there is no attempt to
suggest that such a statement should be accepted as indicating
no problem. It is cited, however, as illustrative of the fully
revealed reaction of the individual. Such individuals will usually
be very relaxed about taking tests. They will probably
want to take a considerable number of them. When they have 
completed testing and have heard an interpretation of them, 
they will take the initiative very readily and terminate the 
interview in a short time. Another type of case that might be 
listed under this category is that of the student who uses his 
interviews, with or without testing, as the occasion for making 
up his mind. Other than furnishing the occasion, the counselor,
if he realizes it, does not need to play any role in the process.
In addition to the hypotheses about treatment specific to
each of the diagnostic categories which have been presented
above, a word should be said about certain general treatment
implications. Since there is general agreement that therapy
starts with the first contact between the client and the counselor,
there cannot be a clear temporal demarcation between the
diagnostic and treatment processes in the interview. This
raises the problem of what treatment processes are most effective
in that period when diagnosis and treatment are developing
[...]. It is suggested that during this introductory
phase of the treatment process, the counselor’s objective should
be to enable the client to clarify his conception of his problem,


to develop insights into his own role and the counselor’s in the 
treatment process, and, where necessary, to give immediate 
release to dangerously pent up feelings. This points to the need 
for fostering client initiative and the exercise of alertness and
insight in responding to client feelings, embodied in the treat-
ment processes so well described by Rogers. 

Does the suggested set of diagnostic categories meet the 
criteria more effectively than the set it is designed to replace? 
At this time only a partial answer is possible. There seems to 
be a firm basis for saying that the suggested set of categories 
are more clearly linked to differential treatment. Further, 
these categories are more closely linked than their predecessors 
to fundamental psychological concepts. However, the ade- 
quacy of this or any such set of categories cannot rest upon 
common-sense judgments alone. Their ultimate acceptability 
must be based upon actual demonstration that: (a) there is a
reasonable degree of agreement among counselors making a
diagnostic judgment on the same client; (b) there is a greater
degree of randomness in the occurrence of various combinations
of categories and a greater frequency of occurrence of clients
who can be diagnosed as belonging to only one category than
[...] of the previous set; (c) the diagnoses do in fact [...]
differentially effective treatments; (d) a [...] understanding of
clients results, as [...] of predictions being associated with
[...] old set.
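
Criterion (a), agreement among counselors, can be checked with a chance-corrected index. The sketch below uses Cohen's kappa, a later statistic that is not part of this paper and is named here only as one possible way to make the criterion operational. The two lists of diagnoses are hypothetical, and the function name is an illustrative assumption.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters on the same clients."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must judge the same non-empty set of clients")
    n = len(ratings_a)
    # Proportion of clients on whom the two counselors agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each counselor's category frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    chance = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if chance == 1.0:
        return 1.0  # both counselors used one and the same category throughout
    return (observed - chance) / (1.0 - chance)


# Hypothetical diagnoses of four clients by two counselors.
counselor_1 = ["dependence", "dependence", "self-conflict", "self-conflict"]
counselor_2 = ["dependence", "self-conflict", "self-conflict", "dependence"]

kappa = cohens_kappa(counselor_1, counselor_2)
print(kappa)
```

A value near 1 indicates agreement well beyond chance; a value near 0 indicates that the counselors agree no more often than their category frequencies alone would produce.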
A final point should be made. Even though the rationale
upon which this set of diagnostic categories [...] be substantiated,
it appears unlikely that [...] specific categories will
prove to be the most effective [...]. It is hardly likely,
assuming the validity of [...] the writer's experience and
insight [...] deep enough to have taken into account a[...]
psychological [...] that could fall within this [...]. [...]
more likely that further observation within this [...] would
reveal additional categories or more fundamental [...] that
would grow out of combinations of the present [...]



Summary 


This paper has presented an analysis of the place of diag- 
nosis in counseling and psychotherapy. It has attempted to 
demonstrate that diagnosis is a necessary process in treatment 
and in the types of research that will provide the basis for the 
improvement of treatment. Diagnostic concepts now used by
counselors in educational institutions were examined in terms
of criteria of meaningfulness, statistical characteristics of independence,
and relation to choice of differential treatment. This
examination suggested that the present diagnostic concepts
based on environmental or sociological constructs are not adequate,
and a new set of concepts based upon psychological constructs
was suggested.


REFERENCES


1. Allport, G. W. “The Ego in Contemporary Psychology.” Psychological
Review, L (1943), 451-478.
2. Bordin, E. S. “A Theory of Vocational Interests as Dynamic
Phenomena.” Educational and Psychological Measurement,
III (1943), 49-66.
3. Edwards, A. L. “Political Frames of Reference as a Factor
Influencing Recognition.” Journal of Abnormal and Social
Psychology, XXXVI (1941), 34-50.
4. Edwards, A. L. “Rationalization in Recognition as a Result of
Political Frames of Reference.” Journal of Abnormal and
Social Psychology, XXXVI (1941), 224-235.
5. Guthrie, E. R. The Psychology of Human Conflict. New
York: Harper and Brothers, 1938. Pp. 408.
6. Maier, N. R. F. Studies of Abnormal Psychology in the Rat.
New York: Harper and Brothers, 1939. Pp. 81.
7. Raimy, V. C. The Self-concept as a Factor in Counseling and
Personality Organization. Ph.D. Dissertation, Ohio State
University, 1943.
8. Rogers, C. R. Counseling and Psychotherapy.
Houghton-Mifflin, 1942. Pp. 408.
9. Thorne, F. C. “A Critique of Nondirective Methods of
Therapy.” Journal of Abnormal and Social Psychology,
XXXIX (1944), 459-470.
10. Wallin, K. “Ego-involvement as a Determinant of Selective
Forgetting.” Journal of Abnormal and Social Psychology,
XXXVII (1942), 20-39.
11. Williamson, E. G. How to Counsel Students. New York:
McGraw-Hill Book Company, 1939. Pp. 341.
12. Williamson, E. G. and Darley, J. G. Student Personnel Work.
New York: McGraw-Hill Book Company, 1937. Pp. 313.



THE PREDICTION OF ADJUSTMENT IN MARRIAGE 
CLIFFORD R. ADAMS 


Pennsylvania State College 


Nearly 20 years ago Hamilton (6) using an interview procedure
studied the marital satisfaction of 100 men and 100
women. He employed some 13 questions to appraise the individual
degree of marital happiness. More recently both Burgess
and Cottrell (4) and also Terman (8) have developed
comprehensive questionnaires that they believe to be helpful in
predicting happiness or adjustment in marriage. Their scales
are based upon extensive studies of couples already married and
each scale correlates about .50 with marital happiness as evaluated
by their somewhat similar techniques.

In 1939 the writer began testing single college students and
couples, in many cases before they were engaged, to see if information
obtained before marriage could be used to predict
adjustment after marriage. With the kind permission of Dr.
Terman his prediction scale was reproduced. This form and
the Adams-Lepley Personal Audit (2) were administered during
the period 1939-1945 to nearly 4000 students at The Pennsylvania
State College. Later the Guilford-Martin Personnel
Inventory I (5) was also used. As students have married
many have been willing to complete questionnaires that furnish
some measure of marital adjustment.

Description of Premarital Test Forms 

The Prediction Scale for Happiness (8) consists of 143
items divided into four parts. Part I, Interests and Attitudes,
includes 54 items taken from the Bernreuter Personality Inventory
(3). Part II, General Likes and Preferences, is made up
of 54 items from the Strong Vocational Interest Blank (7).
Part III, Your Views About the Ideal Marriage, is composed
of 24 questions dealing with husband-wife relationships. The
last part, Parents and Childhood, has 11 items dealing with
family background. Three questions about age, sex, and educational
level precede Part I.

The Personal Audit is made up of nine tests, each consisting
of 50 items. According to the senior author (1) these measure
the relatively independent personality factors of seriousness
(I), firmness (II), tranquility (III), frankness (IV), stability
(V), tolerance (VI), steadiness (VII), persistence (VIII), and
contentment (IX).

The Personnel Inventory I consists of 150 questions. The 
authors say that these measure the three factors of objectivity 
(O), agreeableness (Ag), and cooperativeness (Co). 


Appraisal of Marital Happiness


Terman had used essentially the same items with certain
modifications and additions to evaluate marital happiness as
Burgess and Cottrell had employed in their Index of Marital
Adjustment (4). A decision was made to use their basic questions
but where the two versions differed to any appreciable
extent to include both forms of the items. Hamilton's 13 questions
were also added. By scoring these three sets of questions
with the techniques developed by each author, three separate
marital adjustment scores result: Terman, Hamilton, and
Burgess and Cottrell.

Some 20 questions about education, length of courtship and
marriage, parental approval of marriage, etc. were asked as
well as 13 specific questions dealing with sexual adjustment.

When a student marries for whom premarital test forms are
available, he or she is asked to complete the questionnaire on
marital adjustment. Two forms are sent and both spouses are
asked to fill them in independently. In no case is a questionnaire
submitted until the couple has been married six months or
longer.



Characteristics of the Cooperating Couples


This report is confined to 100 married couples. Both husband
and wife returned the questionnaires. According to the


husbands, their average age is 26.35 years; that of their wives 
is 24.13 years. According to the wives, their average age is 
24.30 years; that of their husbands is 26.18 years. The average 
length of time married is 2.36 years. The 100 couples have 44 
children: 18 boys, 26 girls. 

The average length of acquaintanceship before dating was 
4.5 months; length of courtship before engagement was 22 
years. The couples were engaged approximately 8.5 months 
before marrying. 

No person completed less than one year in college and, with 
the exception of husbands drafted into military service, most 
of the spouses are college graduates. The questionnaires used 
in this study included only those couples living together or able 
to see each other frequently if the husband were in uniform. 

The average age of husbands at marriage was 24 years; of
wives, 22 years.


Marital Adjustment Scores


In Table 1 are shown the average happiness scores earned
by the couples. It will be noted that regardless of the scoring
technique employed husbands tend to earn higher mean adjustment
scores than do wives and that their scores tend to be less
variable.


TABLE 1
Marital Adjustment-Happiness Scores of 100 Married Couples

                 Terman             Hamilton         Burgess-Cottrell
             Husbands  Wives    Husbands  Wives    Husbands  Wives
Mean ...      76.85    75.95     10.55    10.22     169.60   163.75
S.D. ...       [?]      [?]       2.70     [?]       19.00    21.75

[The S.D. row is only partly legible in the source; entries marked [?] cannot be recovered.]


The average scores earned by these couples are higher than
those reported by Terman, Hamilton, and Burgess and Cottrell.
Terman’s husbands had an average score of 68.40; the wives,
69.25. Hamilton’s men earned a mean score of 6.58; the women,
5.92. Burgess reports a mean score of 140.8.

[Passage illegible in source; the legible remainder reads:] ... Terman, 59;
Hamilton, 5; Burgess-Cottrell, 129. Of the 100 wives, 12 had
seriously contemplated separation including 6 who had seriously
contemplated divorce. The highest scores earned by
those contemplating separation were: Terman, 70; Hamilton,
8; Burgess-Cottrell, 157. The highest scores earned by those
who had contemplated divorce were: Terman, 61; Hamilton, 6;
Burgess-Cottrell, 144.

In Table 2 the correlations found between the measures of
marital adjustment are given. Burgess and Cottrell using their
own and the Terman weights of scoring found an r of .90 between
the two sets of resulting scores. That is somewhat higher
than the r's of .78 and .83 obtained in this study. However,
the r's are of sufficient magnitude to indicate that the three
methods of appraising marital adjustment are largely sampling
the same complex of factors.


TABLE 2
Correlations of the Measures of Marital Adjustment

                                      Husbands         Wives
                                      r     P.E.      r     P.E.
Terman and Hamilton ..........       .72    .03      .80    .02
Terman and Burgess-Cottrell ....     .78    .03      .83    .02
Burgess-Cottrell and Hamilton ...    .74    .03      .76    .03

The correlation of the Terman scores of the husbands and
their wives was .84 ± .02 indicating a satisfactory degree of
reliability for the Terman method of evaluating marital happiness.
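
The P.E. figures attached to these correlations are presumably probable errors of r. The article does not say how they were computed; the sketch below assumes the conventional large-sample formula P.E. = 0.6745(1 - r^2)/sqrt(N) with N = 100 couples, and under that assumption it reproduces the reported values to two decimals. The function name is illustrative only.

```python
import math

def probable_error(r: float, n: int) -> float:
    """Probable error of a product-moment correlation.

    Assumed formula (not stated in the article): 0.6745 * (1 - r^2) / sqrt(n).
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

# Correlations quoted in the text and in Table 2; n = 100 couples.
reported = {
    "Terman scores, husbands vs. wives": 0.84,
    "Terman and Burgess-Cottrell, husbands": 0.78,
    "Terman and Burgess-Cottrell, wives": 0.83,
}

for label, r in reported.items():
    print(f"{label}: r = {r:.2f}, P.E. = {probable_error(r, 100):.2f}")
```

On this reading, a correlation is usually regarded as dependable when it is several times its probable error, which all of the coefficients above are.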


Prediction of Marital Adjustment 


Table 3 shows the correlations between the tests
administered before marriage and the adjustment or satisfaction
scores of the 100 couples after marriage. The Terman
Prediction Scale has significant, although not high, positive
correlation with the three measures of happiness or adjustment in
marriage for both husbands and wives. It has supported
the belief of its author that it might have some value in
predicting success in marriage.

TABLE 3
Correlations of Premarital Tests With Adjustment in Marriage

                                 Adjustment-Happiness in Marriage
                              Terman          Hamilton       Burgess-Cottrell
                          Husbands Wives  Husbands Wives   Husbands Wives
Terman Prediction Scale     .32     .38     .24     .25      .30     .32
Personal Audit
  Seriousness              -.02  .01  .02  -.09  [?]  [?]
  Firmness                 -.01  .03  .03  .02  -.05
  Tranquility               .02  [?]  .04  .15  .08
  Frankness                 .13  .22  .19  .13  .26
  Stability                 .28  .00  [?]  -.07  .20
  Tolerance                 .02  [?]  .00  -.10  .03
  Steadiness                .07  .08  .02  .12  .06
  Persistence               [?]  -.05  -.01  -.08  [?]
  Contentment               .23  .05  .13  .08  [?]  [?]

[The Personal Audit rows are only partly legible in the source. Entries are listed in the order extracted; most rows lack at least one value, so column positions are uncertain, and [?] marks an unreadable entry.]

The Personal Audit shows several significant correlations.
There is the suggestion that men who were tranquil, frank, and
steady before marriage are likely to be happier in marriage than
those who were irritable, evasive and emotional. Girls who
were frank, stable, and contented before marriage are more
likely to be well-adjusted in marriage than those who were
evasive, unstable, and worried or discontented.

Correlations for the Personnel Inventory I were computed
only with the Terman adjustment scores. For the husbands,
correlations were .11 for objectivity, .16 for agreeableness, and
[illegible] for cooperativeness; the correlations for the wives were
respectively .09, .18, and .21.

While none of the correlations between the premarital tests
and adjustment-happiness after marriage are high, several
are found to be significant and possibly helpful in premarital
counseling.

TABLE 4
Correlations of the Personal Audit and the Terman Prediction Scale for
Unmarried College Juniors and Seniors

The Personal Audit
  I.    II.   III.   IV.    V.    VI.   VII.  VIII.  IX.
 Ser.  Fir.  Tra.   Fra.   Sta.  Tol.  Ste.  Per.   Con.

[The coefficients of Table 4 are illegible in the source.]



In an earlier study (1), Terman's Prediction Scale was cor- 
related with scores made on the Personal Audit. The single 
males (221) and single females (206) were college juniors and 
seniors. These r's are shown in Table 4. It will be noted by
comparing Table 4 with Table 3 that those parts of the Audit 
correlating significantly with the Terman Scale for unmarried 
students not paired with each other are the parts tending to be 
correlated with adjustment-satisfaction after marriage. 

The Personnel Inventory I was administered to 200 college 
men and women. These students were single but were dating 
steadily or engaged to each other. The resulting scores were 
correlated with the Terman Prediction Scale. The factor of 
objectivity correlated .25 with predicted happiness; agreeable- 
ness, .21; and cooperativeness, .23. 


Homogamy of Scores 


The Terman Prediction Scale can be scored in two ways:
alone or paired. When single scores of 100 dating and engaged
couples were correlated, the r was .28; the correlation of the
paired scores was .30. The respective r's for our 100 married
couples were: single, .39; paired, .43.

In Table 5 are shown the correlations of the paired scores
on the Personal Audit for the 100 married couples. Five of
these correlations approach significance, suggesting that
individuals tend to select mates whose personality traits bear some
resemblance to their own. This would seem to be the case for
the traits of seriousness, stability, tolerance, persistence, and
contentment.


TABLE 5
Correlations of Paired Scores of 100 Married Couples on the Personal Audit
Administered Before Marriage

                   I.    II.   III.   IV.    V.    VI.   VII.  VIII.  IX.
                  Ser.  Fir.  Tra.   Fra.   Sta.  Tol.  Ste.  Per.   Con.
Paired couples    .29   .06   -.08   [?]    .49   .24   .17   .41    .28

[The entry for frankness is illegible in the source.]

On the basis of chance alone 27% of men would score within
certain limits of the women with whom they were randomly
paired. On the Audit sub-tests for the 100 married couples the
lowest percentage of agreement from pair to pair was 35%, the
highest was 79%. For those earning the highest happiness-adjustment
scores, the percentage of agreement on the nine Audit
parts ranges from 40% to 83%.

It is also of interest that wives tended to marry men who 
were less tranquil, less frank, less stable, and more tolerant than 
they were. 

Difficulties in Marriage 

All was not sweetness in the marriages of these 100 men and
women. Some composite percentages show that 77% of the
couples had few to no outside interests to share together. The
most frequent disagreements occurred in respect to demonstrations
of affection, friends, ways of dealing with the in-laws, and
intimate relations. The husbands believed they gave in, the
wives believed they gave in, when disagreements arose. Only
6 wives and 4 husbands frequently to occasionally regret their
marriage and they say they would marry a different person if
they had their lives to live over. 57 wives and 62 husbands
say they are extraordinarily happy, 5 wives and 4 husbands
admit their marriages to be less happy than the average. 86
wives and 91 husbands confide in their mates in all or most
things. Presence of children did not have any bearing on
marital happiness.


Specific Sexual Adjustment 

Forty-six wives and 51 husbands say they are perfectly
adjusted sexually; 11 wives and 18 husbands say they are
almost perfectly adjusted; 21 wives and 16 husbands say there
could be some improvement; 15 wives and 9 husbands say they
are not too well adjusted; 5 wives and 6 husbands say they are
poorly adjusted; 2 wives say they are not at all adjusted. Part
of these difficulties probably stems from the fact that some of the
husbands were in military service.

Eighty-seven wives and 90 husbands say their mates are
sexually very attractive to them. Only 5 wives and 3 husbands
admit there is no attraction. 22 wives have a sexual climax
always, 42 have it usually, 23 have it occasionally, 4 rarely, 9
have never had one. 50% of the wives achieved orgasm during
the first month of marriage; 15% within 2 months; 10% in 3
months; 8% in 6 months; 4% in 9 months; 2% in a year; 1%
later; 8% never; 2% did not specify. 13 wives said they
reached a climax in intercourse before their husbands did; 23
said it was “together”; 49 said the husbands reached it first;
15 did not answer. To the question “Is your mate willing to
have intercourse as often as you wish it,” 23 husbands and 10
wives replied “less often.” To the question “Are you able to
have intercourse with your mate as often as the mate wishes
it,” 8 husbands and 18 wives replied “less often.”

The question about the relationship of the strength of sex
drive to the menstrual period brought these answers from the
wives: desire strongest before period, 15; during period, 4; after
period, 36; little difference, 39; no answer, 6.



Summary and Conclusions 


Prior to their marriage to each other 100 men and 100
women were given tests thought to have value in predicting
happiness-adjustment in marriage. When these couples had
been married an average of 2.36 years, husbands and wives
independently completed questionnaires believed to be measures
of adjustment or happiness in marriage. These questionnaires
were scored by three different techniques. Product-moment
correlations were then computed between these adjustment
scores and the premarital tests.

Certain tentative conclusions are cautiously presented:

1. Adjustment-happiness in marriage can be measured
   reliably.

2. Husbands earned slightly higher happiness scores and
   had less seriously contemplated separation or divorce
   than wives.

3. The three tests of marital adjustment correlated from
   .72 to .83 indicating that they were fairly comparable.

4. While correlations were not of high magnitude, the Terman
   Prediction Scale seems to have some value in predicting
   marital happiness.

5. Men who were found tranquil, frank, and stable when
   appraised by the Adams-Lepley Personal Audit before
   marriage appeared somewhat happier in marriage than
   [illegible].

6. Women whose tests before marriage indicated
   frankness, stability, and contentment appeared to be
   happier in marriage than those who were evasive, unstable,
   and discontented.

7. Significant resemblances in personality traits were found
   between husbands and wives, especially on the traits of
   seriousness, stability, tolerance, persistence, and contentment
   as measured by the Audit.

The limitations of this study include the small number of
couples studied, the shortness of the length of time married,
and the fact that the husbands in some cases were in military
service.


REFERENCES

1. Adams, Clifford R. Manual of Directions for Using and Interpreting
   the Personal Audit. Chicago: Science Research
   Associates, 1945.
2. Adams, Clifford R. and Lepley, William M. The Personal Audit.
   Form LL. Chicago: Science Research Associates, 1945.
3. Bernreuter, Robert G. The Personality Inventory. Stanford
   University: Stanford University Press, 1931.
4. Burgess, Ernest W. and Cottrell, Leonard S. Predicting Success
   or Failure in Marriage. New York: Prentice-Hall, 1939.
5. Guilford, J. P. and Martin, H. G. The Personnel Inventory I.
   Beverly Hills: Sheridan Supply, 1943.
6. Hamilton, G. V. A Research in Marriage. New York: Albert
   and Charles Boni, 1929.
7. Strong, E. K. Vocational Interest Blank. Stanford University:
   Stanford University Press, 1927.
8. Terman, L. M. Psychological Factors in Marital Happiness.
   New York: McGraw-Hill, 1938.


CONSTRUCTION AND ANALYSIS OF WRITTEN 
TESTS FOR PREDICTING JOB 
PERFORMANCE¹


DOROTHY C. ADKINS
United States Civil Service Commission


I. Test Construction


A. Defining What is to be Tested by Written Tests

Construction of a written test requires defining what is to
be tested. Although this statement has been parroted to the
extent that it seems platitudinous, clear understanding of what
it implies is not commonplace. Broadly stated, we must discover
those areas in which individual differences should be
reflected in test scores. We do not want to test in areas in
which individuals do not differ significantly or in areas in which
individual differences that do exist are not critical. In the
selection of social workers, for example, there is little need to
appraise the ability to write legibly (even though typists may
disagree), because, through the operation of other selective factors,
social work candidates do not differ significantly in this
aspect of performance. Measuring height has not been considered
essential for social work positions, although it may be
considered desirable for evaluating candidates’ fitness for membership
on the police force. This is a characteristic in which
wide differences among candidates may be anticipated but in
which such differences are probably not important to success
in social work.


With proper coordination of knowledge and skill in the field
of test techniques and subject-matter competence applied to
defining appropriate test content, useful predictive instruments
can be constructed for any professional field in which individual
differences in performance can be reliably and independently
established. If the production of typists reflects wide individual
differences that can be measured, then we approach with
confidence the construction of tests to predict these differences.
Further, the accuracy of this prediction can be inspected.
Every one will agree that social workers, too, do differ in job
performance. If competent judges can agree, also, on which
social workers are superior and which ones inferior, great
improvements in the selection of social workers should be feasible,
and the extent of improvement should eventually be determinable.

¹ This article, with minor changes, is reprinted through the courtesy of The
Compass, XXVII (1940), 24-30, for which it was prepared at the invitation of the
Civil Service Subcommittee of the American Association of Social Workers.


B. Supplementation of Written Tests by Other Examining Methods


There would be a general consensus among psychometricians
that, for the present, at least, the written test should be only
one part of the total examining process for professional and
administrative positions. The great majority of industries,
civil service jurisdictions, and licensing bodies have required,
and doubtless will continue to require for some time to come,
certain numbers of years of particular types of education and/or
experience. For competitive purposes persons surpassing the
minimum requirements are assigned higher scores, depending
upon amount, pertinency, and recency. Methods of appraising
education and experience have almost of necessity assumed that
all persons who have been exposed to educational courses that
appear similar and who have drawn salary for work that seems
to be of the same relatively broad type have profited from the
education and experience to identical degrees. This we know
is far from true. An analysis of the hypotheses that are made
or could be made in evaluating training and experience will not
be treated at length here. We may, however, go as far as to
venture that ratings of education and experience in time may
be replaced by objective tests that measure, instead, the effects
of a person’s education and experience on his knowledges, skills,
and abilities. Such a test may require two weeks of the subjects’
time for all we know; and no one would claim that we are
prepared for such a step now.


Delimitation of the areas of the written test for employment
purposes thus far has proceeded on the assumption that
no satisfactory paper-and-pencil tests have been developed for
testing personality traits such as dependability, tact, co-operativeness,
and the complex we know as “the ability to get along
with people.” Nor can any written test we know guarantee
that passers’ behavior will reflect socially desired attitudes.
This whole area has been relegated with fleeting compunction
to the oral interview, which has been subjected to far too little
scrutiny but which, again, is outside the scope of this
article.


C. Definition of Fields in Relation to Test Validity 


After this digression, perhaps we can agree that for the present
the use of a competitive written test may be limited to
appraising pertinent knowledges, skills, and abilities that are
distinct from personality factors. In some cases delimitation
of the field to be tested and of criteria for exploring the validity²
of the test is relatively simple. The purpose of a test may be,
for example, to appraise knowledge of the 45 sums of pairs of
numbers below 10. Here the field is in a real sense the 45 addition
problems. The simplicity of this situation is somewhat
deceptive, for even here questions immediately arise in relation
to test form and content that bear on the need for further
definition of the field. How should the problems be presented,
in written or oral form? Should numerical or verbal symbols
be used? What type of item (i.e., completion, true-false,
multiple-choice, etc.) should be used? What should be the method
of indicating answers? Should the test be a power test or a
speed test? If the latter, what time limit should apply? Do
all 45 problems have to be presented or will a sampling suffice?
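
The figure of 45 can be checked by enumeration. The sketch below assumes that "pairs of numbers below 10" means unordered pairs of the digits 1 through 9 with repeats allowed (1 + 1 up to 9 + 9); the article does not spell out the set, so this reading is an assumption. It also draws a hypothetical 15-item sample of the kind the closing question contemplates.

```python
import random
from itertools import combinations_with_replacement

# Assumption: unordered pairs of the digits 1..9, repeats allowed.
problems = list(combinations_with_replacement(range(1, 10), 2))
print(len(problems))  # size of the "field" of addition problems

# Hypothetical 15-item sample of the field (fixed seed for repeatability).
rng = random.Random(0)
sample = rng.sample(problems, 15)
for a, b in sample[:3]:
    print(f"{a} + {b} = ?")
```

Whether such a sample can stand in for the whole field is an empirical question, which is the point the author goes on to make.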

It is clear that the original objective must be detailed more
precisely. If the objective of a segment of teaching is the ability
as at the end of a particular time interval to add all of the
pairs correctly in, at most, say, three minutes, when the pairs
are presented in a particular way, then a three-minute test
consisting of the 45 problems presented in the particular way and
at the end of the prescribed learning period may be considered
valid for this narrow purpose without further study.

² The common definition of the validity of a test, that it is the extent to which
the test measures what it is supposed to measure, is accepted for purposes of this
discussion. The criterion is that which we are trying to measure.

If the problems are not presented in the defined way, if there
is time for only a two-minute test, if a different format is desired
for test purposes, if only a sampling of the 45 problems is presented,
if it is desired to predict ability to learn to do more
complicated problems in addition or in the progressively broader
fields of arithmetic and mathematics, then one may need to
make a special study to determine degree of validity. The
appropriate criterion would vary with the particular purpose. A
shorter test may be correlated with the 45-item test to estimate
the validity of, say, 15 items for predicting score on the entire
"field" of 45; a 45-item test of ability to add given in the second
grade may be correlated with marks in freshman algebra if one
is willing to wait that long at the risk of disappointment; and
so on.
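
The first of these validity studies, correlating a short test with the full 45-item test, can be sketched with simulated data. Everything below is hypothetical: the number of pupils, the spread of ability, and the choice of the first 15 items as the sample are assumptions made for illustration, not data from the article.

```python
import math
import random

def pearson_r(x, y):
    """Product-moment correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1946)            # fixed seed so the sketch is repeatable
N_PUPILS, N_ITEMS, N_SHORT = 200, 45, 15

short_scores, full_scores = [], []
for _ in range(N_PUPILS):
    p_correct = rng.uniform(0.2, 0.95)   # hypothetical ability of one pupil
    answers = [rng.random() < p_correct for _ in range(N_ITEMS)]
    short_scores.append(sum(answers[:N_SHORT]))  # 15-item sample
    full_scores.append(sum(answers))             # criterion: all 45 items

r = pearson_r(short_scores, full_scores)
print(f"r between 15-item and 45-item scores: {r:.2f}")
```

Because the 15 items are part of the 45, this part-whole coefficient is somewhat inflated; correlating the short test with the remaining 30 items would give a more conservative estimate.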

Thus what seemed to be a simple task of defining a field to
be tested leads to some rather complex examining problems.
These become more intricate as the scope of the field to be
covered expands to cover objectives of a single course of study
and of a school curriculum over a period of time, and even more
so when an attempt is made to place persons in rank order on
the basis of predicted job competence. In the latter case the
initial task is to identify a group of knowledges and abilities
which may influence job performance and hence which, we
hope, will predict job competence. Ideally the criteria by which
the validity of the test can be estimated should be established
in the very early stages of constructing the test. We need not
be, however, and in fact rarely are, entirely right in our first
approximations to the prediction of the criterion. Nor do we
have to know the optimal weight for each type of knowledge
and skill in undertaking construction of a test, if we have a
reliable and independent measure of job competence against
which to appraise how well our test serves its purpose. It is
unfortunate that such a criterion is all too rarely obtainable.
On the other hand, an attack on a prediction problem in a
particular field need not start from scratch but can capitalize
on results of previous work in other fields. Every type of item
and all conceivable kinds of knowledge and ability do not have
to be explored.

In the process of defining knowledges for test construction,
a broad field like social work is broken down into more specific
areas, such as knowledge of social case work. As the actual task
of constructing a test is more nearly approached, knowledge of
social case work is further subdivided into the more detailed
facts, concepts, and judgments that constitute this area, until
the breakdowns themselves directly suggest test items. No test
will include every possible breakdown. It is here that the statistical
concept of sampling enters. Just as we can attempt to
measure the ability to solve 45 problems by testing on only 15,
so we attempt to measure knowledge of an entire field of perhaps
several thousand items by testing on only one or two
hundred. Although a detailed analysis of subject matter is not
made in a formal way each time a test is constructed, a competent
examiner uses this approach at some stage, perhaps only
after a group of items is tentatively assembled, as a check on
the adequacy of the sampling of the field.


D. Testing Abilities as Well as Knowledge

It is usually considered profitable in testing for professional
competence to test abilities in addition to knowledge. Of two
social workers who have acquired the same factual knowledge,
it is reasonable to presume that the one who is more intelligent,
who is better able to think through a problem, and who can
meet a new situation more readily, is more likely to be competent
on the job. If this is true, we must not limit our tests to
knowledge alone, but can use profitably results already available
in the field of abilities testing.

Psychologists are not yet in complete agreement on the
question of general intelligence versus more specific abilities.
Many believe that there are not only a number of specific abilities,
such as verbal reasoning and [illegible], but
also a general [illegible]. Others are convinced that general intelligence
is just a sort of average of an as yet incompletely known
hodgepodge of specific abilities. Whichever view is correct, it
is certain that ability tests can be constructed which are useful
in predicting academic achievement and on which occupational
groups at the upper professional levels will do better than those
at the lower levels. We may be practically sure, too, that if we
had adequate measures of basic abilities social workers and
accountants would be found to differ significantly in the pattern
of abilities most conducive to success.

Granted that written tests may be used to measure both
knowledge and ability, question arises as to the value of separate
tests of knowledge and ability versus a single test that
attempts to measure both. So far as I know, this question has
never been satisfactorily answered by an experiment, which is,
of course, what is needed. Both approaches have been used.
With separate tests, the test of knowledge of subject matter is
likely to be constructed from too limited a point of view; it
tends to test mere memory for facts. The test of abilities, on
the other hand, may appear to the candidates to bear little relation
to the job. From the standpoint of public relations, one
can offer quite convincing arguments in favor of a single test to
cover both knowledge and ability. Efforts in this direction are
sometimes palpably absurd. If one tries to construct a
disguised intelligence test, or an abilities test “flavored” with
subject matter, he may end with some such farce as “If a social
worker adds 2 and 2, what answer should she obtain?" when
he is interested only in knowing whether the candidate can add.
From the standpoint of nicety of measurement and disregarding
the element of public relations, doubtless psychometricians
would prefer to try to measure separately various aspects of
knowledge and ability, or at least to identify by one of the
mathematical factor analysis techniques the several components
that enter into scores on a single test. This latter approach
may prove to be fruitful, especially since it is almost
impossible to prepare a written information test that is not
appreciably affected by verbal factors of the sort that determine
reading ability and verbal reasoning.


E. Constructing Individual Test Items


So far we have touched upon some of the more general problems
of how tests are developed. Let us turn briefly to some
of the more specific considerations that bear on the task of
constructing the individual items that go to make up a test. A test
item, whether free-answer (essay) or objective, should present
a definite and clear task. It should elicit responses of such a
nature as to permit the inference that persons who respond in
one way will differ from those who respond in other ways. A
test item for predicting job performance should be such that
the inference can be correctly drawn that persons who give one
answer (or type of answer) will be, on the average, better qualified
than those who give other answers. Such prediction made
from a single item is not very trustworthy. Few single bits of
information are essential. But if each of a group of items is
discriminating in the right direction, even though with imperfect
accuracy, then a prediction based on the group of items can
be made with greater dependability.
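
The gain from pooling items can be illustrated with a deliberately simplified model, which is an assumption of this sketch and not the author's: each item independently favors the better qualified of two candidates with probability p = .6, and the candidate favored by the majority of items is preferred. Real test items are neither independent nor equally discriminating, so the figures are indicative only.

```python
from math import comb

def p_majority_correct(n_items: int, p: float) -> float:
    """Probability that more than half of n_items independent items,
    each pointing the right way with probability p, favor the
    better-qualified candidate."""
    need = n_items // 2 + 1
    return sum(comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
               for k in range(need, n_items + 1))

for n in (1, 11, 51, 101):   # odd counts avoid ties
    print(n, round(p_majority_correct(n, 0.6), 3))
```

Under these assumptions a single item is right only six times in ten, while a hundred such items together are right far more often, which is the sense in which the group of items is more dependable than any one of them.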

Large-scale test development projects are confined largely
to tests of the short-answer type. This form, as against the
essay, has the advantage that the scoring can much more
readily be made objective and reliable, so that a candidate's
responses yield the same score when evaluated by different persons
and so that a candidate obtains the same relative score
when he takes different forms of the same test. The objective
type permits a much broader sampling of the subject matter
or abilities that it is desired to test. This opportunity for wider
coverage permits increased reliability and validity. It may be
claimed that objective tests are useful only for testing possession
of factual knowledge; and it must be conceded that many
of them in the past have tested little else. Fortunately, this
is not an inherent defect of the form, but only a limitation of
the item constructors. Objective tests can be developed to test
the abilities to draw proper conclusions from given data, to
select which one of several principles applies, to classify and
organize data, to select the data necessary to solve a problem,
or to solve problems in unfamiliar context or with insufficient
data.

In the construction of an objective test item, some test technicians
would say that the first consideration is to set a problem
or task that will be evident to all of the candidates. I would
amend this principle to indicate that the task need be clear only
to the better candidates. In general, the task may be but need
not necessarily be clear also to the poorer candidates. If only
the better candidates understand the task and give a particular
response while the poorer candidates do not understand the
task and hence do not give the same response as that of the
better candidates (except by chance), the item may be just as
discriminating as an item in which all candidates understand
the problem, and possibly more so. One word of caution, however,
should be inserted. We must guard against achieving
difficulty merely by giving undue weight to verbal factors when
what we are interested in appraising is understanding of basic
concepts.

Good test items cannot be merely excerpted from a book.
Although written source materials are of inestimable value to
the item constructor, his is a creative task of selecting content
that will be appropriate and likely to yield a selective item; of
developing that content into a statement of a problem, and
perhaps several alternative solutions to the problem; of employing
suitable qualifying statements; of adding necessary context
and deleting unnecessary verbiage; of presenting controversial
issues without making a commitment; of avoiding “specific
determiners” or extraneous clues to the answer; of putting the
concept into a form that provides a natural way of asking the
question and that at the same time provides ease and objectivity
of scoring. These are only a smattering of the factors
to be considered in constructing a test item.



F. Participation of Subject-Matter Specialists in Test Construction


Whether an item will differentiate the competent from the
incompetent is a matter of the examiner’s judgment until there
is opportunity to try the item out on persons whose job performance
is known and to discover whether it does, in fact, yield
the desired discrimination. For this reason participation in the
construction of examinations in specialized subject-matter fields
by persons who know the subject matter is highly desirable.
Even after they have been trained in test construction, not
every item they construct or tentatively approve can be expected
to be useful. Nevertheless, such items have a much
greater chance of yielding discrimination than those prepared
solely by test technicians with no competence in the subject-matter
field in question. For purposes of argument it may be
admitted that, given sufficient time, the psychometrician, like
the monkey at the typewriter, could write every possible test
item in the subject-matter field. Then, granted opportunity
for trying out the items on persons of known competence, it
would be possible to select items for assembly into a test that
would be just as good as the test that could be assembled after
a tryout of items prepared in collaboration with subject-matter
consultants. The former test, composed of a selection of items
based on a tryout of a large number constructed solely by psychometricians,
would be much more costly to construct. If, as
has often been the case when there is immediate need for a test,
there is no time for tryout at all, injection of subject-matter
competence into the initial development of the test is even more
essential. Aside from improving the chances that tests will
predict job performance, participation of subject-matter consultants
makes a testing project more acceptable to the professional
field in question, especially in the early stages of the
project, at a time when such acceptability is often critical.

If in the case of an employment test the item constructors
are sufficiently familiar with the subject matter and abilities
concerned, if they have had sufficient access to job information,
and if they have been fairly ingenious in selecting content for
items and in working the content into item form, there is rea-
sonable expectation that predictions of job success on the basis
of scores on a large group of items will be significantly better
than forecasts made without a test. The expectation of in-
creased efficiency of prediction may be verified and the extent
of improvement determined by research methods.

II. Analysis of Test Results


A. The Concept of a Standardized Test


[...] all of you have heard good things of “standardized”
tests [...] more to be desired than tests [not so] designated. I
suggest, however, that the term lacks a precise [meaning and]




that various tests to which it is applied differ markedly with 
respect to some of the very characteristics on which “standardi- 
zation” may be taken for granted. A standardized test almost 
invariably is accompanied by directions for administration 
which are to be followed by whoever administers the test. The 
directions cover such factors as uniform timing and provision 
for the same amount of instruction and practice material for 
all subjects, so that everyone takes the test under essentially 
the same conditions. A standardized test also almost always 
comes equipped with a scoring key, and the scoring is typically 
objective. Thus the effects of administration and scoring of the 
test by different persons are minimized. Fairly frequently, 
normative data are provided with standardized tests, showing
standards of attainment for various groups for which the test
is thought to be appropriate. For example, the frequency dis-
tributions of raw scores on an achievement test in the social
studies may be given for 8th-, 9th-, and 10th-grade pupils, to-
gether with some measure of the average and variability for
each grade. These distributions are obtained by administering
the test to sample groups of pupils in each of the three grades.
To the extent that the samples are large enough and sufficiently
representative, the results may be applied to other groups.

Ideally, standardized tests not only have these character-
istics, but they also have been subjected to more refined re-
search to establish their difficulty, reliability, and validity, the
equivalence of different forms provided, and the appropriate-
ness of the weights at which separate parts are combined. In
practice they differ appreciably in the extent to which research
methods have been applied in investigating these character-
istics. We shall discuss in greater detail the concepts involved
in such research, because they constitute the core of statistical
analysis of test data.


B. Test Difficulty


The basis of all approaches to the problem of analyzing the
difficulty of a test or of a test item is the performance on the
test or item of a defined group of subjects, say male 8th-graders
in urban schools, white 16-year-olds in Alabama, or all persons
who have filed a particular civil service examination application
on time, met whatever minimum requirements there are, and
appeared for and completed the written test. Particular atten-
tion is called to the importance of the definition of the group;
difficulty for another group may differ considerably.

In the case of a test as a whole, difficulty is determined by 
the frequency distribution of the scores of the defined group. 
For a single test item which is scored either right or wrong, diffi-
culty is determined by the percentage of the defined group who
get the item correct.³ It will be seen that this percentage is
only a special case of a frequency distribution.
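
As a rough illustration of the definition just given, the difficulty of a right-or-wrong item can be computed as the percentage of a defined group answering it correctly. The sketch below is only a minimal example; the function name `item_difficulty` and the ten scores are hypothetical and do not come from the original study.

```python
def item_difficulty(responses):
    """Return the percentage of the defined group answering the item correctly.

    `responses` holds one score per subject: 1 for right, 0 for wrong.
    """
    if not responses:
        raise ValueError("the defined group must contain at least one subject")
    return 100.0 * sum(responses) / len(responses)


# Hypothetical scores of ten subjects on a single item.
scores = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
print(item_difficulty(scores))  # 70.0
```

The same item given to a differently defined group would in general yield a different percentage, which is why the definition of the group matters.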

The difficulty of a test (and hence of its component items) 
has an important bearing on its value for its purpose. A test 
that is below the level of abilities of the poorest subjects is of 
no value in discriminating among the subjects. Similarly, a 
test that is too difficult even for the best subjects gives us no 
information for predicting which subjects are superior. It is 
pretty well accepted that as a general rule the average difficulty 
of the items in a test should correspond to the average ability 
of the subjects; that is, the items should be such that, on the 
average, about half of the subjects will answer them correctly. 
If, however, the test is to be used to select only a few outstand- 
ing subjects, it should be much more difficult; and if it is to 
weed out only a few extremely poor subjects, it should be much
easier. Mere appropriateness of difficulty does not guarantee
the value of a test. Test difficulty may, in fact, be exactly as
desired and yet the test be completely worthless for the purpose
for which it is intended. Proper difficulty, then, is a necessary
but not a sufficient condition. The test must be, in addition,
reliable and valid.


C. Test Reliability 

The reliability of a test refers to the extent to which the 
results of the test can be verified after a period of time or re- 
gardless of the particular items. Stated inversely, reliability 
refers to the extent to which chance factors affect test results. 


Several methods of estimating test reliability have been devel-
oped. One of the common ones is the test-retest method, by
which scores obtained on the first administration of a test are
correlated with scores obtained on a second administration of
the same test. This method has the defects that if the interval
between the two administrations is too short, practice and
memory factors make the estimated reliability too high;
whereas if the interval is too long, changed conditions (for-
getting, variable opportunities for practice, and the like) may
make the correlation too low. A method which overcomes some
of these defects is that of administering comparable or equiva-
lent forms of a test at the same time and correlating the scores
on the two forms. The difficulty with this approach is that
comparability has been inadequately defined. Tests which look
alike, or in which pairs of items are matched throughout on an
inspectional basis, do not meet the requirements for a sound
concept of comparability, a matter discussed more fully later
in this paper. Improved methods of estimating test reliability,
which overcome the difficulties of these classical techniques,
have been developed recently but are outside the limits of this
discussion.

³ Although a few psychometricians have designed special ways of defining and
analyzing test and item difficulty, the concepts presented herein serve most practical
purposes.
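
The test-retest estimate described above amounts to an ordinary product-moment correlation between the scores from two administrations. The following sketch assumes that this is the coefficient intended; the function name `pearson_r` and the scores are illustrative only.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 cases)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary on both administrations")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores of six subjects on a first and a second administration.
first = [42, 55, 61, 38, 70, 49]
second = [45, 53, 66, 40, 68, 47]
print(round(pearson_r(first, second), 3))
```

The same computation applies to the equivalent-forms method, with the second list holding scores on the alternate form rather than on a repetition of the same test.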

It is desirable that the reliability of a test be estimated
before it is used. A study for this purpose is not always possible
in the face of practical demands. Fortunately this does not
mean that a test developed by competent psychometricians and
subject-matter consultants is likely to have a reliability coeffi-
cient of .00. Many of the factors that influence reliability are
now known: the objectivity of the scoring, the type of item,
the number of items, the lack of ambiguity of the items, and the
independence of the items, to enumerate only a few. Hence
tests often can be developed with considerable assurance that
their reliability is reasonably satisfactory. To be certain that
the reliability is as high as it should be or that it is as high as it
can be made under whatever limitations of testing time, type
of item, and the like, are imposed, an experimental administra-
tion of the test followed by analysis of the results is needed.


D. Test Validity 


It was stated earlier that regardless of the suitability of the
difficulty of a test, it had to be not only reliable but also valid
if it were to be useful. The test must be valid for the purpose
for which it is to be used. The term validity should not be used
in a vacuum. A test satisfactorily valid for one end may be,
and often is, totally worthless for another. At this point the
topic of test analysis dovetails with test construction, for the
objectives that lead to the process of defining or delimiting the
areas to be tested must be consistent with the purpose for which
validity is desired. To the extent that the definition of areas
to be tested is satisfactorily achieved, the test is valid for its
purpose. To demonstrate the extent to which the test is valid,
one needs a measure of whatever was to be tested that is inde-
pendent of the test itself. One can resort to all sorts of sta-
tistical maneuverings with scores on a test and never fully
establish its validity if access to an independent criterion is
lacking. Further, the criterion measures must themselves be
sufficiently reliable to be worth bothering with. No test can
predict a criterion that has no reliability. This is one of the
serious problems in improving tests for employee selection.
The typical evaluation of performance or service rating, as
used in both private and public agencies, is at worst useless and
at best unsatisfactory as a criterion against which to assess tests
designed to predict job performance. Different raters [...]
to observe job [...] “above average.” Positions grouped for
classification purposes may actually differ in important ways,
so that there is no reason to expect ratings for them to be
comparable or to suppose that the same selection tests should
apply. Probably the most fruitful approach to this difficulty is
to develop [special] criterion measures against which tests can
be validated. Such measures may advantageously be broken
down into components, some of which one and some another
part of the test may predict. Several pitfalls of service ratings
are avoided by the use of ratings that have no purpose other
than to serve as a criterion for test analysis. If several ratings
can be secured for each [employee], the reliability of the
criterion need not be unknown, as it so often is for service
ratings; and if the ambiguous cases, that is, those on which
there is disagreement among raters, are
excluded from the study, the chances for demonstrating that a
test does in fact predict a reliable criterion should be markedly
improved.

The performance of would-be employees who have been 
screened out by whatever selection devices were used does not 
get rated at all. To the extent that this screening had validity,
the distribution of employee performance is curtailed so that 
there may be little opportunity to demonstrate the value of a 
test that is very effective toward the lower end of the scale. 
As noted before, a test useful for discriminating among subjects 
at one end of the scale may break down completely at the other. 
This effect is determined by the difficulty of the test in relation 
to the group for which it is used. 


E. Special Problems of Prevalidation 

Attempts to “prevalidate” a selection test (that is, to estab-
lish its validity before it is used for actual selection purposes)
always face the problem of curtailed distributions of both test
scores and criterion scores unless an agency is venturesome
enough to hire all comers. This condition has been approxi-
mated during the war period, during which, however, the cut-off
in the distributions probably has been transferred to the upper
ends. Statistical techniques for “correcting” for curtailment in
either the test variable or the criterion variable or both are
available. Although the conditions for their application can-
not always be met, they at times enable us to approximate the
validity of the test.
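
The text does not say which correction technique is meant. As one hedged illustration, the sketch below uses the familiar univariate correction for direct curtailment on the test variable, which assumes a linear regression of criterion on test and equal scatter about that regression throughout the range. The function name and the figures are assumptions made for the example.

```python
import math


def correct_for_curtailment(r_restricted, sd_unrestricted, sd_restricted):
    """Estimate the test-criterion correlation in the uncurtailed group.

    r_restricted    : correlation observed in the curtailed (selected) group
    sd_unrestricted : standard deviation of test scores among all candidates
    sd_restricted   : standard deviation of test scores among those selected
    Assumes curtailment was made directly on the test variable.
    """
    if not -1.0 <= r_restricted <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    if sd_unrestricted <= 0 or sd_restricted <= 0:
        raise ValueError("standard deviations must be positive")
    k = sd_unrestricted / sd_restricted
    r2 = r_restricted ** 2
    return (r_restricted * k) / math.sqrt(1.0 - r2 + r2 * k ** 2)


# Hypothetical case: the selected group has half the spread of all candidates.
print(round(correct_for_curtailment(0.30, 10.0, 5.0), 3))  # 0.532
```

When the two standard deviations are equal the estimate reduces to the observed correlation, as it should when no curtailment has taken place.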

Another problem in prevalidation is that types of tests or
particular test items that are valid for a candidate group may
differentiate not so well or even not at all among employees,
who through experience on the job may all have learned to do
things that are within the province of only the best of the candi-
dates. The experimenter is also faced with the need to protect
the confidential nature of test materials if he expects to use the
final test in ways that have important bearing on persons’ lives.
“Leakage” destroys not only the confidence of the candidates
but also may nullify whatever validity the test otherwise would
have had.




F. Analysis of Individual Test Items 

From analysis of the relationship of a total test to a cri- 
terion, the next step is to investigate the relationship of each 
item to the criterion. Thus, by any of a number of statistical 
techniques, all more or less close approximations to correlation 
coefficients, one estimates the correlation of each item with the 
criterion. He then discards or tries to improve those with little 
or negative validity and attempts to discover the character- 
istics of the more valid items so that he can construct additional 
ones that are at least as good. The statistical analysis may 
even be carried a step further, to investigation of the validity 
of each choice in a multiple-choice item. Those alternatives
which discriminate in an unexpected way are then revised or
replaced. There are still further item analysis methods for
taking into account the interrelationships of the items as well
as their relationships with the criterion.
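
One of the simpler approximations of the kind referred to above is the point-biserial coefficient, which is the product-moment correlation between a right-or-wrong item and a continuous criterion. The sketch below is offered only as an example of such a technique; the paper does not say which statistic its author preferred, and the names and data are hypothetical.

```python
import math


def point_biserial(item_scores, criterion):
    """Correlation of a 0/1 item with a continuous criterion measure."""
    if len(item_scores) != len(criterion) or len(item_scores) < 2:
        raise ValueError("need one item score and one criterion score per subject")
    n = len(criterion)
    passed = [c for i, c in zip(item_scores, criterion) if i == 1]
    failed = [c for i, c in zip(item_scores, criterion) if i == 0]
    if not passed or not failed:
        raise ValueError("the item must be both passed and failed by someone")
    p = len(passed) / n  # proportion answering the item correctly
    q = 1.0 - p
    mean_all = sum(criterion) / n
    sd_all = math.sqrt(sum((c - mean_all) ** 2 for c in criterion) / n)
    if sd_all == 0:
        raise ValueError("the criterion must vary")
    mean_pass = sum(passed) / len(passed)
    mean_fail = sum(failed) / len(failed)
    return (mean_pass - mean_fail) / sd_all * math.sqrt(p * q)


# Hypothetical data: item right/wrong and a criterion rating for four subjects.
print(round(point_biserial([1, 1, 0, 0], [4, 3, 2, 1]), 3))  # 0.894
```

Items with coefficients near zero or below it are the ones a test constructor would discard or try to improve.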
Because of the difficulty of obtaining reliable and useable
independent criterion measures, many experimenters have re-
sorted to an “internal” criterion, which simply means the total
score on the test itself, in an effort to improve written tests.
Let it be clear that this is not a method of validating a test.
If the test as a whole does not have validity, this process can
never yield it. What the process does do is to select items which
tend to measure whatever the test as a whole measures. If the
test is measuring what it is intended to measure, then the least
valid items can be culled out by this device. One further word
of caution is needed: “validation” against an internal criterion
may lead to very erroneous conclusions if the total test is
measuring more than one factor, as is typically true of employ-
ment tests. The process in inexperienced hands may lead to the
selection of items of more and more homogeneous content, with
the result that coverage of the finally selected items is so nar-
rowed that actual validity is reduced. Briefly, the solution is
to set up separate internal criterion scores for each factor pres-
ent in the total score.


G. Equivalent Forms of a Test 


Mention was made earlier of comparable or equivalent
forms of tests. In a large-scale testing program there is need


for interchangeable tests if one wants to be able to compare the 
results of a test given to the same individuals more than once 
or to large groups of individuals at different times, in order to 
minimize the effects of practice and leakage, respectively. 
Items paired off according to difficulty and apparently in the 
same field of knowledge or requiring the same ability do not 
insure the comparability of two forms of a test. The tests may 
not yield the same type of score distribution; and the items may 
not in fact be in the same field, so that the tests would not be 
sufficiently highly correlated to be treated as interchangeable. 
An experiment can be set up that will yield comparable tests, 
although the conditions for such an experiment are not always 
administratively feasible. Satisfactorily equivalent forms of a
test can be developed by administering to a single group of at
least, say, 200 cases, somewhat over two times as many items
as are needed for a single form of a test and by [controlling]
practice effects and the sampling of the population for which
the test is to be used. In the interests of expediency, certain
approximations may be made. If, for example, one can assume 
two groups of subjects to be equivalent with respect to the 
knowledge and abilities tested in the test on the basis of infor- 
mation other than their performance on the test, he can ad- 
minister one of two forms to one of the two groups and the other 
form to the other group and adjust the tests somewhat in
accordance with the results. He has made a big assumption,
however, and he can never obtain the correlation between the 
two forms by this method. 


H. Weighting the Components of a Test 


In combining parts of a written test to get scores on the
total written test, and in combining written test scores, ratings
of education and experience, and oral interview scores to get
scores on a broader examination process, the question of the
importance or weight to be attached to each part arises. In
either type of combination, if reliable external criterion data
were available for prevalidation, the correlation of each part
with the criterion and the intercorrelations of the parts could
be computed. Then a technique is available (multiple regres-
sion analysis) that would tell us at what weights the parts
should be combined in order to yield the maximum correlation
with the criterion. Combining the parts at these optimal 
weights, instead of at weights set up to accord with personal 
opinion, gives the highest possible predictive efficiency to the 
particular selection devices used. 
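
For the simplest case of two parts, the optimal weights described above can be written out directly from the three correlations involved. The sketch below assumes that the parts are expressed in standard-score form, so that the results are standardized regression weights; the function name and the correlation values are hypothetical.

```python
import math


def optimal_weights(r1c, r2c, r12):
    """Standardized regression weights for two parts and their multiple R.

    r1c, r2c : correlation of each part with the criterion
    r12      : intercorrelation of the two parts
    """
    if abs(r12) >= 1.0:
        raise ValueError("the two parts must not be perfectly correlated")
    denom = 1.0 - r12 ** 2
    beta1 = (r1c - r2c * r12) / denom
    beta2 = (r2c - r1c * r12) / denom
    r_squared = beta1 * r1c + beta2 * r2c
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("the three correlations are not mutually consistent")
    return beta1, beta2, math.sqrt(r_squared)


# Hypothetical values: written test and interview against a criterion.
b1, b2, multiple_r = optimal_weights(0.50, 0.40, 0.30)
print(round(b1, 3), round(b2, 3), round(multiple_r, 3))
```

With more than two parts the same idea requires solving a set of simultaneous equations, but the principle is unchanged: the weights depend on the intercorrelations of the parts as well as on their separate correlations with the criterion.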

Lacking an external criterion, as we so often do, weights are 
established to reflect what is thought to be the importance of 
the several parts. The exactitude of the operation of such 
weights, however, is more apparent than real. This is true 
because the effective weight of one variable in combination with 
others is dependent on its variability and on its correlation with 
the other variables. Other things being equal, the part which 
has the highest standard deviation or the highest correlation 
with other parts has the greatest influence in determining the 
relative standing of the subjects on the total test. In some 
instances, then, it is desirable to adjust the scores on some 
variables to equate variability and to take into account the 
intercorrelations of the variables before assigning the “arbi- 
trary” weights that are supposed to reflect their importance. 
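
A common way of equating variability before applying judgmental weights is to convert each part to standard scores. The sketch below shows that step only; it does not adjust for the intercorrelations of the parts, which the text notes also affect the effective weights. All names and figures are illustrative assumptions.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw scores to standard scores (mean 0, standard deviation 1)."""
    m = mean(scores)
    s = pstdev(scores)
    if s == 0:
        raise ValueError("scores must vary in order to be standardized")
    return [(x - m) / s for x in scores]


def weighted_composite(parts, weights):
    """Combine several parts at the stated weights after equating variability.

    parts   : list of score lists, one list per part, subjects in the same order
    weights : one judgmental weight per part
    """
    if len(parts) != len(weights):
        raise ValueError("need exactly one weight per part")
    z_parts = [standardize(p) for p in parts]
    n_subjects = len(parts[0])
    return [sum(w * z[i] for w, z in zip(weights, z_parts))
            for i in range(n_subjects)]


# Hypothetical case: a written test with a wide spread and an oral interview
# with a narrow spread, intended to count equally.
written = [10, 20, 30]
oral = [1, 2, 3]
print(weighted_composite([written, oral], [1, 1]))
```

Had the raw scores simply been added, the written test would have dominated the composite because of its larger standard deviation, whatever weights were nominally intended.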


In this discussion of the field of test construction and analy- 
sis, the purpose has been to indicate clearly the high spots 
without obscuring the complexities of some of the problems 
that arise. An attempt has been made to show the types of 
short-cuts that may be profitable, the advantage that may be 
taken of previous experience in the field of testing, and the types 
of approximations to which resort may be made in conducting 
analyses of test results. The close interrelationship of test 
construction and the analysis of test results has been stressed 
throughout. Ideally, one would never undertake to construct
a test without planning an experiment to demonstrate or im-
prove its value; and one would never conduct a research on
test results without constructing a better test thereafter. Al-
though the model test has perhaps not yet been developed and
the perfect research project not yet completed, we are in a
better position to improve test construction practices and to 
adapt our research procedures to the exigencies of the moment 
if we have an awareness of what our goal should be. 




THE USE OF OBJECTIVE ACHIEVEMENT EXAMINATIONS
IN A NAVAL TRAINING PROGRAM


LIEUTENANT COMMANDER D. D. FEDER¹
USNR


Introduction 


With the expansion of Naval training after Pearl Harbor,
it became necessary for the Navy to supplement its radio mate-
riel schools by acquiring facilities of civilian trade schools and
colleges which could easily and quickly convert their programs
to those needed by the Navy. These schools were given an
outline of what the Navy's program had been in the electronics
training field, but this was necessarily limited because for the
first time it became necessary to separate highly classified
material from unclassified, and to make up the latter into a
course of fundamentals which could be taught freely by
civilian staffs. From this outline engineering college and trade
school faculties were required to formulate a curriculum cov-
ering the fundamental concepts of radio as exemplified in a
knowledge of Ohm’s Law, alternating and direct cur-
rent, general communication circuits, radio power supplies,
electrical machinery and rotating power supplies, and finally,
fundamentals of radio as exemplified in an understanding of the
symptoms and causes of various radio difficulties. The goal of
this training is to produce Electronic Technicians Mates (for-
merly called Radio Technicians) competent to service any and
all electronic gear including radar, sonar, transmitters, receivers,
etc.

¹ The opinions and assertions contained herein are those of the writer and
are not to be construed as official or [as reflecting the views of the Navy Department
or] the naval service at [large]. The author desires [to acknowledge ...]
Almstead, Lt. [...] Lawrence and Ens. Louise E. Gettys, who con-
tributed materially to the work herein reported.



This division of training created Pre-Radio and Elementary 
Electricity and Radio Materiel schools (E.E. and R.M.) and 
the Advanced Radio Materiel schools, the last-named being 
staffed exclusively by naval personnel and dealing with classi- 
fied equipment and documents. 

It will be readily understood that with only a bare outline 
for guidance, the varied backgrounds of civilian education could 
not help but result in the development of a variety of programs, 
all well-intended but each, almost of necessity, producing an 
end product somewhat different from that of the others. This 
lack of uniformity was a constant source of difficulty for the 
advanced schools. They regularly found it necessary to spend 
their first month of instruction in review and even first teaching 
of certain fundamentals in order to make sure that each man 
had the necessary preparation to undertake the advanced 
curriculum. 

In December 1943 the Navy undertook a comprehensive 
program for standardization. This was aimed first at the Pre- 
Radio and Elementary Electricity and Radio Materiel (E.E.
and R.M.) schools. Previous curriculum studies were drawn
upon, practices of existing schools were studied and some re- 
search was done on the needs of the advanced schools. Out of 
this work the curriculum and laboratory outline for the two
introductory phases were set up and placed in operation in
March 1944.


Various materials were prepared and were still in process up
to the end of the war, all with the intention of securing better
standardization of training output. Among these was a pro-
gram of comprehensive final examinations for both the Pre-
Radio and E.E. and R.M. schools. This report will deal with
the E.E. and R.M. situation, since it is more complicated and
its curriculum represents more new learning.

From the outset the achievement examinations were re-
garded as an integral part of the new program. It was felt that
no matter how detailed curricular explanations might be, the
schools could get their most direct leads as to the type of train-
ing and understandings the Navy deemed necessary from the


OBJECTIVE ACHIEVEMENT EXAMINATIONS 215 


examination hurdles it set for graduates of the program. To 
this end each examination attempts to give a comprehensive 
sampling of the three-month content. The six forms together 
are believed to cover almost all of the functional content of the 
course. 

Description of the Examinations 


Part I of each examination yields a single score covering the 
first and second months’ work. Because Part II was designed 
to serve in lieu of the schools’ former third-month examination, 
as well as the comprehensive final examination, it was divided 
into four parts representing the courses of instruction in the
third month: Communication Circuits, Power Supplies, Elec-
trical Machinery, and Fundamentals of Radio. Since this
examination was designed to serve a diagnostic function also,
a separate score is derived for each part. A three-hour time
limit was established for each part, permitting nearly all men
to finish each part of each examination. Since emphasis at this
level is upon the development of understanding of fundamen-
tals, it was felt that the speed factor should play a minimum
role in determining scores.

All items are five-response multiple-choice type. A studied
effort was made to make the items as completely functional as
possible. For example, in the first and second months, mathe-
matics fundamental to alternating current is studied. The
examination problems, however, are essentially electrical, but
so devised as to sample almost all of the mathematical skills
which the trainees should have mastered as the basis for pur-
suing the advanced school studies. Routine definitions and
memorizable solutions were minimized with the emphasis on
problems demanding reasoning with the facts learned. Typical
diagrams and schematics such as will actually be encountered
aboard ship are employed. This type of item is best seen in
certain forms which employ the complete schematic diagram
of a five-tube superheterodyne receiver similar to the one the
men build in their laboratory practice. The series of problems
based on this diagram includes the location of typical causes
of faulty operation, the prediction of faulty operation which will
result from various typical equipment failures, etc. This type
of item is believed to approximate closely actual situ-
ations encountered and reasoning demanded under shipboard
conditions. Some examples of this “trouble-shooting” type of
item are shown below.


In Figure XVII, the output voltage is zero. A voltmeter connected between the rectifier
filament and ground indicates a normal voltage. The trouble is caused by:

(1) [...] R
(2) center tap of high-voltage winding not grounded
(3) open L
(4) bad rectifier tube
(5) C[?] shorted

[Figure XVII: schematic diagram, not reproduced]

In Figure XVII, the plates of the rectifier tube heat excessively when the switch
is closed. The cause of this trouble is:

(1) short between turns of L
(2) open C[?]
(3) bleeder resistor open
(4) leaky C[?]
(5) open high-voltage secondary

In Figure XVII, the output voltage is lower than normal and has excessive
ripple. The cause of this trouble is:

(1) open R
(2) open L
(3) gassy rectifier tube
(4) shorted turns in [?]
(5) open C[?]


Statistical Data 


Statistical treatment of each test has included conventional
item analysis (biserial correlations and difficulty values), calcu-
lation of reliability coefficients by the Kuder-Richardson for-
mula, calculation of part-whole and interpart correlation coeffi-
cients, preparation of norms, and validity studies using first-
month advanced school grades as the criterion.




Unless otherwise indicated all statistics are based upon rep- 
resentative samples of 200 cases. 


TABLE 1

Reliability Coefficients of the E.E. and R.M. Final Achievement
Examinations

Form            r
1 Revised      .85
[entries for the remaining forms illegible in source]


Bearing in mind the tendency toward underestimation of
the Kuder-Richardson formula, and the fact that these are
power tests, the reliability coefficients in Table 1 may be con-
sidered satisfactory. Items in each form were selected to pro-
vide a good range of difficulty values. Beginning items are
solved by 90 to 100 per cent of the men. Approximately two-
thirds of the items on each form are answered correctly by 50
per cent or more of the men. Only a few items are missed by
as many as 75 per cent of the trainees.
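
The report does not state which of the Kuder-Richardson formulas was applied. Purely as an illustration, the sketch below computes the one usually numbered 20, which needs only the proportion passing each item and the variance of total scores. The function name and the small score matrix are hypothetical.

```python
def kr20(score_matrix):
    """Kuder-Richardson formula 20 for right-or-wrong items.

    score_matrix : one row per subject, one 0/1 entry per item.
    The variance of total scores is computed with N (not N - 1) in the
    denominator, to match the p*q item variances.
    """
    n = len(score_matrix)
    k = len(score_matrix[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two items")
    totals = [sum(row) for row in score_matrix]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores must vary")
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in score_matrix) / n  # proportion passing item j
        sum_pq += p * (1.0 - p)
    return (k / (k - 1)) * (1.0 - sum_pq / var_total)


# Hypothetical answers of four subjects to three items.
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(answers))  # 0.75
```

Because the formula rests on the consistency of performance from item to item, it tends to give a conservative figure when the items are not all measuring the same thing.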


TABLE 2

Average r_bis and Difficulty Values of the E.E. and R.M. Final
Achievement Examinations

Form          Average r_bis    Average D
1 Revised         .36             54.3
2                 [illegible]     [illegible]
3                 .38             81.5
4                 .42             73.8
5                 .39             76.4
6                 .40             77.7


From Table 2 it will be noted that Form 1 Revised is the
most difficult. Despite improvement in average scores as a
result of improved instruction, this form remained somewhat
more difficult than the other forms. Despite the generous time
limits for administration, all forms yielded essentially sym-
metrical, bell-shaped distributions.

To determine the validity of the part-score breakdown of


Part II of the examinations, interpart and part-whole correla- 
tions were computed for each form. All of these were so closely 
similar as to warrant their averaging for purposes of summary 
in Table 3. 

The interpart correlations are consistently low, whereas the 
part-whole correlations, in most cases, are relatively high. 
Thus, the use of part scores seems to be warranted. In addi- 
tion, each part clearly contributes some relatively independent 
measurement to the total score. The generally higher coeffi-
cients of correlation between Parts IIa and IId reflect the
greater similarity of material. It is generally considered that
the difference between Communication Circuits and Funda-
mentals of Radio is one of emphasis rather than content.


TABLE 3

Part-Whole and Interpart Correlations for the E.E. and R.M. Final
Achievement Examinations

            Part I   Part II   IIa   IIb   IIc   IId
Part II      .49
Total        .77      .92
[remaining entries illegible in source]

As previously indicated, the E.E. and R.M. schools are inter-
mediate to and preparation for the work of the Advanced Radio
Materiel Schools. Therefore, a major criterion of the effective-
ness of the E.E. and R.M. examinations may be observed in
their ability to predict marks in at least the first month of the
advanced schools. In analyzing the correlations in Table 4 it
should be noted that critical scores on the E.E. and R.M. ex-
aminations were established at the outset, and that the ranges
represented in these statistics are therefore restricted. Thus
the coefficients may all be considered to be artificially low. At
the present time, these validity statistics are available only for
the first four forms. Similar data are being collected for the
two later forms. The first-month grades used here were those
[given] after the advanced schools revised their curriculum
so that the former review functions of the first month were
eliminated.


TABLE 4

Intercorrelations of First-Month Advanced School Grades and Scores on the
E.E. and R.M. Final Achievement Examinations

School             Form 1   Form 2   Form 3   Form 4
Bellevue            .59      .55      .64      .66
Chicago             .38      .52      .53      .65
Corpus Christi      .49      .54      .54      .60
Treasure Island     .52      .59      .69      .70


These validity coefficients are well within the limits con-
sidered satisfactory for educational prediction.


Use of the Examinations in Improving Instruction 


At the outset reports were made to schools after each test-
ing. Then, as the effects of these reports became noticeable
their frequency was reduced. Each report consisted of an item
analysis in percentages showing each school how the perform-
ance of its graduates compared with that of all graduates com-
bined. Since all schools graduated on the same day and were
required to administer the same form to a given graduating
class, these comparisons could be made directly.

The first of these reports was followed up by a visit to e 
School, and indoctrination discussions with faculties on the use 
dud results. They were shown how in- 
ld be located and overcome. Im- 
out, and failure to comply with 


ach 


interpretation of the 
Structional weaknesses cou 
Proper emphases were pointed l 
the official curriculum could be located easily. — 

For approximately three months, Form 1 Revised was used
exclusively. On the successive administrations the scores
increased steadily, the average rising from 70 to 95. This
increase led to the belief that this form had been compromised,
either via the student grapevine or by instruction consciously
or unconsciously directed at it. Therefore, as soon as Forms 2,
3, and 4 were available, they were immediately rotated, and
Form 1 Revised was not given for about two months. When it was
administered again, the scores still reflected positive growth,
with the mean now up to 97. There was no change in basic
personnel selection procedures, and actual checks on successive 
classes indicated that personnel quality was relatively constant. 
Therefore, the steady upward trend of scores, which continued 
after additional forms of the test intervened, is interpreted to 
mean that instruction had become directed toward the curricu- 
lum objectives and that any loss of security of the examinations 
was negligible. 

Successive item analyses over a period of approximately a 
year have brought to light the actual changes in instruction 
made by various schools. With the overall average as a mini- 
mum target to shoot for, schools have made those modifications 
necessary in order to bring their graduates up to or above the 
level of all graduates. As a result of this directed instruction, 
many areas were so well taught that the items on them became 
too easy and hence lost their discriminatory power. In revisions
which the tests are now undergoing such items are being
studied with reference to the possibility of omitting or changing 
them, bearing in mind, of course, that even though an item may 
lose its discriminatory power, it may nevertheless be valuable 
as a guide to instruction, and therefore should be left in. 

Since a studied effort was made in the construction of these
examinations to include as distracters only such responses as
had some degree of meaning, or which experience had shown
to be the most frequent errors made, the attention of
instructors was directed to the importance of analyzing any and
all examinations in order to identify characteristic errors.
This information was provided through the item analysis from
the Bureau of Personnel, but each school was shown how to make
similar analyses on the spot, since an opportunity for two to
four days of remedial instruction was always available before
the graduates were detached. This attention to characteristic
errors was also instrumental in aiding the relatively
inexperienced instructors who had to be used because of the
emergency.

It has been possible to trace directly the effects of these
examinations and the procedures employed with them upon
the instructional program. In addition to the large amount of
statistical support obtained, numerous reports from school
officials indicated that the examinations provided them with
guideposts which enabled them to interpret more fully the
official curriculum. The sampling of instructional material
accomplished is so complete that if schools do teach for the
examinations, they will, of necessity, teach the desired
materials with the desired functional-practical emphasis.

Perhaps the most convincing evidence of the standardization
and improvement accomplished through the introduction of
standard curriculum and examinations comes from the reports of
the advanced schools. They are no longer able to identify men
by the E.E. and R.M. schools they attended because the products
are all so uniform that the former individual school training
peculiarities have disappeared; furthermore, improvement in
student achievement after about six months of this program was
sufficient to eliminate the need for extensive review in the
first month of advanced school, and to permit the immediate
undertaking of advanced instruction.


VALIDATION STUDIES ON JOB INFORMATION TESTS 
D. WELTY LEFEVER, ALICE VAN BOVEN, and JOSEPH BANARER 


San Bernardino Air Technical Service Command 


The Personnel Testing Unit at the San Bernardino Depot
of Air Technical Service Command was assigned the task of
developing, administering, and interpreting several varieties of
measuring devices to meet the needs of the civilian training
program, the military training program and the operating
divisions of the Depot. It was decided to specialize in the
construction of job information tests since practically no
appropriate tests were available in the trade areas related to
the repair and maintenance of airplanes. Job information tests
were developed for 97 different jobs or occupational areas and
most of these were revised at least once. The four-choice,
best-answer type of item was adopted as standard form while the
length of the test varied from 75 to 100 items.

While research per se could not be stressed in a war emergency
program, it was thought highly advisable to do everything
practicable to validate these job information tests. An
essential first step was to determine the reliability of each
test. Where the samples were sufficiently large the split-halves
technique was employed, including the usual correction by the
application of the Spearman-Brown formula. For 31 tests thus
checked the reliability coefficients ranged from .62 to .95 with
a median value of .87. When nine of the most recently developed
tests were checked, the median coefficient proved to be .91.
These results may be considered fairly satisfactory, especially
since the later tests written by more experienced technicians
exhibited higher reliability than those constructed in the early
days of the unit.
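
The split-halves correction mentioned above can be sketched in a few lines. This is only an illustration of the standard Spearman-Brown step-up formula; the half-test correlation used here is a hypothetical value, not a figure reported by the authors, and the function name is an assumption made for the example.

```python
def spearman_brown(r_part: float, length_factor: float = 2.0) -> float:
    """Estimate the reliability of a test `length_factor` times as long
    as the part whose correlation is `r_part` (2.0 for split halves)."""
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)


# Hypothetical correlation between the two halves of one test.
half_test_r = 0.77
full_test_reliability = spearman_brown(half_test_r)
print(round(full_test_reliability, 2))  # 2(.77) / (1 + .77)
```

With a half-test correlation near .77 the corrected coefficient falls near the median of .87 reported for the 31 tests, which gives a rough sense of how strongly the two halves of a typical test must have agreed.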
The Personnel Testing Unit has been greatly concerned
throughout the history of its efforts in job information test
construction over the lack of adequate criteria for establishing
test validity. Basically, of course, it may be argued that
acceptance of the test items by qualified experts in the work
areas constitutes a source of test validation. This is probably
true although “coverage” is by no means assured. However,
evidence is definitely needed which will show that the kind of
items used in a paper-and-pencil instrument will measure
abilities actually significant to job success. Production
records appear to be impracticable for this purpose because of
the great variety of work assignments and the difficulty of
placing each kind of product on a comparable scale. Whether
measuring devices of the type here discussed can ever be
completely validated is a serious


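TABLE 1


Summary of Correlation Coefficients Between Job Information Test
Scores and Criteria of Validity


                                           Number
Criterion                                    of     Lowest  Highest  Median
                                           Samples

Instructors' grades ....................      9       .05     .85     .42
Official efficiency ratings ............      8      -.03     .62     .33
Special rating scale for sheet metal
  workers ..............................      1        ..      ..     .42
Chart of operations, sheet metal
  workers ..............................      1        ..      ..     .54
Special rating scale for radio
  repairmen, seven elements ............      1        ..      ..     .53
Special rating scale for radio
  repairmen, two elements ..............      1        ..      ..     .66
Civil Service grade designations .......     11       .29     .62     .52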


question. It is entirely possible that the tests are and will
remain superior to any criterion obtainable.

A number of criteria were employed as sources of evidence
for the validity of the job information tests. These included
instructors' final grades in training courses, efficiency
ratings, foremen’s ratings on especially developed scales, Civil
Service grade designations and follow-up studies in which the
personnel records of high-scoring and low-scoring workers were
compared. All but the last-mentioned criterion were correlated
against job information test scores. The coefficients of
correlation are summarized in Table 1.

During the early history of the San Bernardino Depot
several thousand civilians were trained for aircraft work in a
program [illegible]. One of the few criteria available for test
validation purposes was the final grade given the mechanic
trainee by his instructor. The coefficients of correlation
ranged from .05 to .85 with a median value of .42. This median
value is not unsatisfactory, but the evidence for validity is
inconclusive since it is quite possible that both the
instructors' grades and the scores on a paper-and-pencil test
reflect a highly verbalized approach to a mechanical occupation.
Perhaps the test and the judgment of the instructor are both
far removed from actual shop conditions.
The efficiency ratings required by War Department regulations
were also considered as possible criterion measures. These
ratings could not be expected to be preeminently satisfactory
in this respect since factors other than job information are
included. Certainly such matters as alertness, attendance
record, and ability to get along with the supervisor and fellow
workers are among the elements rated by the foreman. The
efficiency ratings exhibit a strong tendency to accumulate at
the high end of the scale, constituting a serious weakness (at
least from the statistical point of view). Observation indicates
the possibility that purely extraneous human relations factors
were involved in a considerable number of instances.

Table 1 shows that for eight correlation coefficients computed
between efficiency ratings and job information test scores the
median value is .33. The range runs from nearly zero to a
substantial positive value, .62. The evidence seems to indicate
that efficiency ratings measure many factors not included in
the job information scores.


In order to focus attention directly on job performance aside
from the personal elements present in the efficiency ratings, a
special rating scale was used by foremen in sheet metal work.
Although a five-point scale was provided, only three categories
were actually used: “fairly satisfactory,” “satisfactory,” and
“very satisfactory.” The foremen apparently hesitated to
designate any of their workers as “unsatisfactory” or
“outstanding.” The correlation between these ratings and the
job information test scores proved to be .42. This is somewhat
better evidence of validity than was obtained by use of the
official efficiency ratings.

A more satisfactory criterion was obtained by evaluating
certain charts kept by the Civilian Training Branch. These
charts were designed to indicate how many specific operations
each worker is qualified to perform, without supervision, with
partial supervision, or under close supervision. The work in
sheet metal was subdivided into some 40 operations, such as
the operation of a certain machine, or the proper use of a
variety of tools. These data were assembled for 55 workers in
one unit of the Sheet Metal branch; the summaries were then
correlated with the job information test in Sheet Metal Repair.
The training data were translated into a weighted score by
counting four points for each operation the worker was qualified
to perform without supervision, three points if partial
supervision was required, two points for jobs in which the
worker required complete supervision and one point for jobs in
which training had been barely started. The correlation
coefficient between these training data and the raw score in
the job information test was computed to be .54. The reader
should bear in mind that these workers were engaged in actual
production activities and were not trainees.
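
The weighting scheme just described can be illustrated with a short sketch. The point values are those given in the text; the operation names, the status labels, and the treatment of operations not yet begun (simply left off the chart, so they add nothing) are assumptions made for the illustration.

```python
# Points per operation, as stated in the text.
POINTS_BY_STATUS = {
    "without supervision": 4,
    "partial supervision": 3,
    "complete supervision": 2,
    "training barely started": 1,
}


def weighted_training_score(operation_status: dict) -> int:
    """Sum the points for every operation recorded on a worker's chart."""
    return sum(POINTS_BY_STATUS[status] for status in operation_status.values())


# Hypothetical chart for one worker (operation names are invented).
example_worker = {
    "power shear": "without supervision",
    "riveting": "partial supervision",
    "forming brake": "complete supervision",
    "layout": "training barely started",
}
print(weighted_training_score(example_worker))  # 4 + 3 + 2 + 1 = 10
```

Scores of this kind, one per worker, are what would then be correlated with the raw job information test scores.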

Since the sum total of the specific operations comprising the
work of the unit may be considered to be the maximum work
competency for that unit, a special measure representing the
extent to which each worker had mastered that competency
constitutes a valuable criterion for validating the job
information test. It must be recognized, of course, that the
chart of operations presents a summary of both job information
and skill. In the light of this fact, the correlation obtained
may be judged quite satisfactory.

Valuable evidence of validity was obtained when the supervisors
of 79 radio repairmen rated the workers on a specially
devised experimental scale consisting of the following elements:


1. Knowledge of theory.
2. Quantity of work.
3. Specifications of product.
4. Neatness.
5. Care of equipment.
6. Thoroughness.
7. Understanding of schematics.


Weighted ratings for all of the seven elements that were
correlated with the job information scores yielded a coefficient
of .53. When two elements, “knowledge of theory” and
“understanding of schematics,” were selected as representing
more nearly the content of the test, the correlation between
average ratings and job information scores rose to the highly
satisfactory value of .66. Here is definite evidence of the
validity of job information tests. It is not likely, however,
that many work areas will produce such high validity measures
since they lack the organized body of detailed trade information
which must be mastered by the skilled radio repairman.

Perhaps the most consistent and practical criterion for
validating the job information test is the Civil Service
designation of each employee. If it is assumed that these
workers were hired in harmony with their experience and evidence
of skill and that they were advanced in accordance with their
growth in competence, then the Civil Service grade designation
may be taken as a criterion measure for determining the validity
of job information tests. Correlation coefficients computed for
the relationship between job information test scores and the
grade designations range from .29 to .62 with a median value of
.52. As a validity coefficient a correlation of .52 may be
judged rather satisfactory. In other words, such a correlation
constitutes evidence that the job information test measures many
of the same elements which were considered when the worker was
employed and promoted.
In another way evidence was obtained by comparing the mean test
scores of workers of different grade designations. For 35 of the
job information tests constructed by the Personnel Testing Unit
the mean scores in percentages for workers of different
designations were:


Helpers .................... [illegible]
Juniors .................... [illegible]
Journeymen ................ 69
Seniors .................... [illegible]


The lowest passing score for workers in each grade level was
assigned by the making of a grade chart which was a compromise
between straight percentage marking and strict adherence
to the normal curve. The mean minimum passing score in
percentage for the same 35 tests was:


Helpers .................... 34
Juniors .................... 45
Journeymen ................ 56
Seniors .................... 67


An analysis of the reasons given by workers who separated
from the Depot did not furnish as clear-cut evidence for the
validity of the job information test as was desired. A few
significant clues were obtained, including the fact that of
those former trainees who left to go to school, more than twice
as many had test scores in the highest decile as in the lowest.
The test scores of those discharged for misconduct averaged much
below normal. In general, the obviously poor reasons for
separating were usually accompanied by lower test scores, but
most separations appeared to have little relationship to job
success.

A series of follow-up studies made a number of months after
job information tests were administered revealed a definite
trend in the personnel actions occurring during that period.
Workers who received promotions had made higher scores than
those remaining at the same work-level.

The first of these follow-up studies was based on the scores
and the personnel records of civilian trainees who were required
to pass job information tests after the completion of their
training to become eligible for promotion to the helper level,
at which time they were transferred to the Maintenance Division.
Job information tests were administered in the Civilian Training
Branch in 1943, from January through November. During
that time 1,452 tests were administered to civilian trainees in
42 different courses taught at the school. (Tests administered
to off-reservation trainees were not included in this study.)
The follow-up was made approximately one year after the
completion of civilian training classes in mechanical work. The
job information test scores of the trainees in the 42 classes
were carefully reviewed, and approximately the highest and the
lowest 10 per cent of the trainees in each class were selected
for follow-up. The employment records of the trainees who made
high scores on the job information tests were compared with
the histories of those who failed or nearly failed the tests.


TABLE 2

Personnel Records of 144 High-Scoring and 144 Low-Scoring Workers as Determined
by the Results of Job Information Tests Administered 12 to 21
Months Before the Date of Check-up


                                          Number   Number
                                            of       of     Critical  Chances
                                           high     low      ratio    in 100*
                                          scorers  scorers

Helpers or Juniors doing the kind
  of work for which trained ..........      11       28       3.0      99.9
Journeymen or seniors doing the
  work for which trained .............      26        9       3.1      99.9
Foreman, instructor or engineer ......       4        0       2.0      98
Classified [illegible] ...............      17       16        .2      58
Transferred to other army station ....      14        9       1.1      86
Military furlough ....................      10        7        .8      79
Separated ............................      62       75       1.5      93

* Chances in 100 that there will be a discrimination in the same direction upon
repeated use of these tests under similar conditions.
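
The critical ratios and "chances in 100" in Table 2 can be approximated with the sketch below. The authors do not state their formula; the version here, which is an assumption, divides the difference between the two proportions by an unpooled standard error and converts the ratio to a one-tailed normal probability. Applied to the counts in the table (144 workers in each group), it comes close to the printed values, for example about 3.0 for the first row and about 3.1 for the second.

```python
from math import erf, sqrt


def critical_ratio(high_count: int, low_count: int,
                   n_high: int = 144, n_low: int = 144) -> float:
    """Difference between two proportions divided by its standard error
    (unpooled form; assumed, since the source gives no formula)."""
    p_high = high_count / n_high
    p_low = low_count / n_low
    standard_error = sqrt(p_high * (1 - p_high) / n_high
                          + p_low * (1 - p_low) / n_low)
    return abs(p_high - p_low) / standard_error


def chances_in_100(ratio: float) -> float:
    """One-tailed normal probability, in per cent, for a critical ratio."""
    return 100 * 0.5 * (1 + erf(ratio / sqrt(2)))


# First row of Table 2: 11 high scorers and 28 low scorers.
first_row_ratio = critical_ratio(11, 28)
print(round(first_row_ratio, 1), round(chances_in_100(first_row_ratio)))
```

A pooled standard error would give slightly smaller ratios; the unpooled form is used here only because it agrees more closely with the tabled figures.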


The data presented in Table 2 show that more of the low-scoring
trainees remained at the helper or junior level, while a larger
proportion of the high-scoring trainees had become journeymen
or seniors. More of the low-scoring trainees had separated. A
larger percentage of high-scoring workers transferred to other
installations. The column of critical ratio values indicates
statistically reliable differences for the two major comparisons
for the regular Civil Service grade designations. It may be
concluded, therefore, that the job information tests have value
in selecting employees who are capable of assuming
responsibilities in the shops and in pointing out those who are
less likely to justify promotion.

A group of 452 workers in the Sheet Metal Branch became
subjects for another follow-up study. A check on personnel
records was made three months after the job information test
had been administered. A summary of the record is presented
in Table 3. The workers are classified by score intervals
regardless of their Civil Service designation. It may be noted
from the column totals that 20 per cent of the workers had been
promoted, 20 per cent had resigned, 2 per cent had been
discharged, 3 per cent had received merit increases, and 55 per
cent remained in the same grade designation and step as when
tested.


TABLE 3


Summary of Personnel Actions Over a Three Month Period for 452 Employees
Grouped According to Raw Score on the Sheet Metal Repair Test,
Regardless of Civil Service Designation


                                  Percent    Percent    Percent   Percent   Percent
                                  resigned  discharged  promoted   merit   unchanged
                                                                  increase

Of   2 workers scoring 90-100 ..      0         0          50        0        50
Of  43 workers scoring 80- 89 ..     12         0          26        5        57
Of 117 workers scoring 70- 79 ..     19         0          24        6        51
Of 134 workers scoring 60- 69 ..     17         0          14        3        66
Of 104 workers scoring 50- 59 ..     25         1          21        3        50
Of  41 workers scoring 40- 49 ..     20        12          17        0        51
Of  11 workers scoring 30- 39 ..     45         9          18        0        28
Mean percentages ...............     20         2          20        3        55

TABLE 4


Analysis of Follow-up Data in Table 3 to Determine Significance of Differences


              Resigned or                 Promoted or
Test Score    Discharged     Unchanged    Given Merit    Totals
80-100             5             26            14           45
50- 79            72            200            83          355
30- 49            19             24             9           52
Totals            96            250           106          452


Chi-square (corrected for continuity) = 8.97


The findings were no doubt influenced by the rules restricting
the number of merit increases that could be granted in any month
and some of the workers had not served the six months necessary
to be eligible for an increase.

Table 3 reveals a tendency for more promotions and merit
increases to accompany better scores and for resignations and
discharges to be associated with poor scores. The statistical
significance of this tendency was determined from the analysis
shown in Table 4, which indicates that about six times in one
hundred a chance distribution would deviate as far from
independent values as does the one in this table.
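
The Table 4 analysis can be retraced with the sketch below, which uses the nine cell counts of that table. The source does not spell out its continuity correction; reducing every absolute deviation by 0.5 is an assumed reading, and it yields a value close to the printed 8.97. The tail probability uses the closed form of the chi-square distribution for four degrees of freedom, which is what a 3 x 3 table has.

```python
from math import exp

# Rows: scores 80-100, 50-79, 30-49.
# Columns: resigned or discharged, unchanged, promoted or given merit.
observed = [
    [5, 26, 14],
    [72, 200, 83],
    [19, 24, 9],
]


def chi_square_with_continuity(table):
    """Chi-square for a contingency table, with each absolute deviation
    reduced by 0.5 (an assumed form of the continuity correction)."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (abs(count - expected) - 0.5) ** 2 / expected
    return statistic


def chi_square_tail_df4(statistic):
    """Upper-tail probability of chi-square with 4 degrees of freedom."""
    half = statistic / 2
    return exp(-half) * (1 + half)


chi_square = chi_square_with_continuity(observed)
probability = chi_square_tail_df4(chi_square)
print(round(chi_square, 2), round(probability, 2))
```

The resulting probability of roughly .06 corresponds to the statement that about six chance distributions in one hundred would depart this far from independence.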

A second follow-up study of the same group nine months
after testing indicated that most of the workers had been
promoted or received merit increases within the interval. Their
personnel records are summarized in Table 5. The statistical


TABLE 5 


Personnel Actions Taken Within Nine Months After Administration of the Job
Information Test in Sheet Metal Repair, Classified by
Designation and Test Mark


                                   Per cent        Per cent remaining     Per cent
Grade designation                  promoted          on same level        separated
and mark                        Twice     Once     Merit        No
                                                  increase    change
Of 
Of 2 helpers who received A.. 38 18 2 ~ 42 
Of g7 pj Pers © н B.. 4 39 a 57 
RI helpers « « cz 1 3 i i 63 
Or рез « «р 31 е 2 67 
helpers и а Е. 25 E " 75 
Of 27 1910ге « « А, uo 3j 33 22 
40 Juniors «а Won Э 33 4 30 
Of 3] 19110тѕ н NEC 2 15 25 7 50 
Or 1p Wiors * « p 10 35 10 45 
Gye ч «к. 30 20 50 
Or 5 Ј00гпеутеп * — « P 25 75 ЕЕ 
Of 22 0Urneymen * — « am 57 14 28 
Journeymen “ — « Che lt 40 5 36 
Or 9 0шпеутеп« — « D. 13 50 19 19 
2l'urmneymen * — « É. 11 45 22 22 
Of 2 Seniors «а с 100 X 
seniors “в n 100 Rs 5 
seniors « “ E. 50 55 50 


significance of the various comparisons implied by the table
has been computed and is presented in Table 6. The marks (A, B,
C, D, and E) were derived from raw test scores and have
approximately the usual meaning of five equal divisions of the
base line of a normal curve. They were determined for each
Civil Service grade designation separately.

Again there is definite and reliable evidence that the
favorable personnel actions tend to accompany the high scores
on the


TABLE 6


Analysis of the Follow-up Data in Table 5 to Determine Significance of Differences

Those receiving double promotions


Test mark   Number   Per cent
    A         20        77
    B          3        12      Critical ratio between proportion receiving
    C          3        12      A and all others equals 2.8
    D          0         0      Significant at 1 per cent level
    E          0         0
  Total       26       101


Those receiving single promotions


Test mark   Number   Per cent
    A         13        12
    B         37        33      Critical ratio between proportion receiving
    C         38        34      A and B and those receiving D and E
    D         19        17      equals 2.2
    E          5         4      Significant at the 5 per cent level
  Total      112       100

Favorable personnel                    A plus B       C        D plus E
actions (weight)                        marks       marks       marks
Double promotion (2) ..............       23           3           0
Single promotion (1) ..............       50          38          22
Merit increase (.5) ...............       20          20           2
No change (0) .....................        2           5          11
Mean number of favorable personnel
  actions .........................     1.12         .82         .61
Standard deviation ................      .50         .32         .34
Standard error of the mean ........     .056        .040        .043
Critical ratio for A plus B vs. C equals 4.5. Significant at the 1 per cent level.
Critical ratio for C vs. D plus E equals 3.6. Significant at the 1 per cent level.


job information test and that fewer promotions and advances
in pay are in store for those workers who make low scores. The
comparisons made in Table 6 show a uniformly high degree of
reliability. The interpretation at this point should be
discounted to a slight degree because the test results were just
beginning to be consulted by an occasional supervisor before
recommending a worker for promotion. The job information
tests at this stage in the history of the San Bernardino Depot
were not generally recognized as valuable evidence in deciding
personnel actions. Probably not more than ten per cent of the
promotions made involved a consideration of test scores.
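
The "mean number of favorable personnel actions" in the last part of Table 6 appears to be a weighted average in which a double promotion counts 2, a single promotion 1, a merit increase .5, and no change 0. The sketch below, offered only as an illustration of that reading, applies those weights to the counts printed for the A plus B group and the C group; the dictionary keys and function name are invented for the example.

```python
# Weights read from the row labels of Table 6.
ACTION_WEIGHTS = {
    "double promotion": 2.0,
    "single promotion": 1.0,
    "merit increase": 0.5,
    "no change": 0.0,
}

# Counts of workers by personnel action, as printed in Table 6.
counts_a_plus_b = {"double promotion": 23, "single promotion": 50,
                   "merit increase": 20, "no change": 2}
counts_c = {"double promotion": 3, "single promotion": 38,
            "merit increase": 20, "no change": 5}


def mean_favorable_actions(counts: dict) -> float:
    """Weighted mean of favorable personnel actions for one group."""
    workers = sum(counts.values())
    weighted_sum = sum(ACTION_WEIGHTS[action] * n for action, n in counts.items())
    return weighted_sum / workers


print(round(mean_favorable_actions(counts_a_plus_b), 2))  # 106 / 95
print(round(mean_favorable_actions(counts_c), 2))         # 54 / 66
```

These two groups reproduce the tabled means of 1.12 and .82, which supports the weighting interpretation for those columns.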


Summary 


The evidence for the validity of the job information tests
may be listed briefly as follows:


1. When the instructors’ final grade in civilian training
classes was taken as a measure of the trainees’ success, the
correlation between job information scores and grades was fairly
satisfactory (median .42). At best these final grades
constitute a rather makeshift criterion of validity.

2. The official efficiency ratings were even less effective as
criteria; the median correlation coefficient with job information
scores was .33. When the many “human” factors affecting
these ratings are considered, perhaps the resulting correlations
are not exceptionally low.

3. When the chart of specific job operations prepared as
part of the on-the-job training program was correlated with
job information test results by means of a special summarizing
score for a group of 55 workers, the coefficient was found to be
.54. This outcome may be considered quite good since the
chart included elements of skill as well as job knowledge.

4. Special ratings by foremen for the job knowledge of
workers produced rather satisfactory correlations with job
information scores. These coefficients ranged from .42 to .66
and constitute direct evidence of validity.

5. Perhaps the most practical and meaningful criterion for
validating the job information test is the Civil Service grade
designation. Correlations between work level and test scores
proved to be consistent and satisfactorily high. These
coefficients ranged from .29 to .62 with a median value of .52.
Since the grade designation represents the judgment of Civil
Service experts at the time of placement and the later judgment
of management if and when promotions were made, the work level
or Civil Service designation constitutes a valuable indication
of what the worker may be expected to know about his job.

6. An analysis of the personnel records of workers in relation
to their job information test data reveals a definite trend:
workers with higher job information scores tended to have more
favorable personnel records.



AN ATTEMPT TO IMPROVE THE COMPREHENSIVE
EXAMINATION AT THE MASTER'S LEVEL


MAURICE E. TROYER 


Syracuse University 


Master's degrees may be earned in either of two ways in
the School of Education at Syracuse University: 30 semester
hours of graduate credit including a thesis, or 30 hours followed
by a comprehensive and an intensive examination. The intensive
examination covers the candidate's field of special study, i.e.,
Administration, Supervision, Personnel, Social Studies, etc. The
comprehensive examination covers the 12 semester hours in the
core program required of all Master’s candidates. Four areas
make up the core: Philosophy and Educational Sociology,
Educational Psychology, Measurements and Statistics, and
Research. This article describes the comprehensive examination
now in use, how it was developed, the method of recording and
reporting the results of individual performance and some
conclusions that have implications for the educational program
at the Master's level.

Developing the Test

For some years the staff member in charge of comprehensive
examinations sent out a call for questions as the time for the
comprehensive examination approached. After each professor had
sent in his questions, the result was an agglomeration of
objective and [illegible] questions making up the comprehensive
examination. The problem of balancing the test was exceedingly
difficult. For example, the Educational Psychology requirement
could be met by Child Psychology, Adolescent Psychology,
Psychology of Learning in Elementary Education, or Psychology
of Learning in Secondary Education. It was exceedingly
difficult to make up an examination so that a student who had
taken Psychology of Learning in Elementary Education would not
be penalized by not having had Adolescent Psychology.

In the fall of 1943 an effort was made to improve the
examination procedure. The first step was to clarify the
objectives of the core program to be covered by the
comprehensive examination. Four were agreed upon by the
faculty: (1) Knowledge of fact and principles from the
literature of professional education. (2) Ability to interpret
professional data presented either in tabular, graphic or case
study form. (3) Ability to make good decisions when faced by
professional problems and to give appropriate reasons to
substantiate their decisions. (4) A tendency to keep up with
current professional literature.

The next step in the procedure was to choose test items
appropriate for gathering evidence with respect to each of the
four goals. For the first goal, Knowledge of Fact and
Principles, multiple-choice items of the best answer type were
selected. For the second goal, items patterned after the
Progressive Education Association interpretation of data tests
were selected. For goal three, items patterned after the PEA
application of principles test seemed most promising, and for a
tendency to keep up with current professional literature, the
staff agreed on the use of the matching type of test items.

As a safeguard to the balance of the test, the faculty agreed
that there should be 25 multiple-choice items from each of the
four core areas, one interpretation of data problem from each
area, one application of principle problem, and 10 matching
items from each area. In order to assure balance of coverage
within each of the four divisions of the core area, staff members
within each division were to work out their portion of the test
cooperatively. For example, instructors of the four Educational
Psychology courses were to work together in developing
that portion of the test covering the four objectives for
Educational Psychology.

The staff member in charge of the examination prepared a
guide for the preparation of each type of item. Illustrative test
items of each type were also included in the guide. This was
an important step, but it did not meet the need fully. Staff
members [illegible] in the preparation of the various kinds
[illegible] of the test for efficient administration, scoring,
and summarizing difficult. There were also [illegible] in the
items prepared. For example, in the multiple-choice items the
correct response was far too frequently the longest of the four
choices.


Sample Items, Scoring, and Interpretation of the Results


Part I, Knowledge of Fact and Principle. The following
are illustrative of the multiple-choice items.


A correlation coefficient of .65 between two tests indicates:
1. Satisfactory reliability.
2. That knowledge in one area contributes to knowledge in
   the other.
3. Very little relationship.
4. That students who score high in one of the tests also
   tend to score high in the other.

The first step in conducting a research is the:
1. Collection of data.
2. Compilation of a bibliography of similar researches.
3. Formulation of a working hypothesis.
4. Careful formulation of the problem to be solved.

Which of the following statements regarding delinquency
would probably best represent the viewpoints of present-day
psychologists?
1. Delinquency is due to some innate deficiency.
2. Delinquent behavior is often an attempt to adjust to
   frustration.
3. Most children in “delinquency areas” in a city become
   delinquent.
4. Low intelligence is frequently a cause of delinquent
   behavior.

Jefferson’s concept of education for leadership was:
1. Complete free education for all.
2. Educate everyone; select the best; continue the process
   of their education; again select; etc.
3. Choose those who tend to have leadership qualities; educate
   them in separate schools, as well as freely educating
   all who are to be the followers.


[illegible] multiple-choice items were of the best [illegible]
of fact where the choices must obviously be right or wrong.
Answer sheets were used for Part I. Scoring was in terms of the
number of correct responses.



Part II, Interpretation of Professional Data.—The follow- 
ing problem is illustrative. 
Study carefully the following table and the legend below it. 


The Effect of Allowing Pupils to Inspect their Corrected Test Papers
(Experiment by Plowman and Stroud)

                                      Mean      Mean
                                      first     second    Differ-          Critical
Materials  Condition            N     testing¹  testing²  ence     S.E.    Ratio

A   I.  Corrected tests
        inspected              125    21.5      25.2
                                                           5.2      .21     24.8
A   II. Corrected tests
        not inspected          125    21.5      20.0

B   I.  Corrected tests
        inspected              125    21.3      25.0
                                                           4.6      .48      9.6
B   II. Corrected tests
        not inspected          125    21.8      20.4


1. The first test was given immediately after the materials were
   studied.
2. The second test was repeated six days later without warning.
Note: When the "B" materials were studied, the groups were shifted.
   Thus students who did not have opportunity to inspect test
   results on the "A" materials did have opportunity to inspect
   test results on the "B" materials. The tests consisted of 30
   multiple choice items. The materials were textbook in nature.

Directions: Mark the following conclusions 1, 2, 3, 4, or 5 according
to the following directions:
   1-if you think the statement is true in the light of the data.
   2-if you think the statement is probably true in the light of the
     data.
   3-if you think there is insufficient data to mark the items with
     1, 2, 4, or 5.
   4-if you think the statement is probably false in the light of the
     data.
   5-if you think the statement is false in the light of the data.


( )  1. Within the limits of this experiment opportunity
        to inspect test results is proven to be a justifiable
        procedure.
( )  2. Since we have no knowledge of the relative intelli-
        gence of the groups we cannot have confidence that
        the gains are due to opportunity to inspect test
        results.
( )  3. Students who inspected their corrected papers
        learned where to place their check mark instead
        of learning the meaning of the materials.
( )  4. The results of this experiment have great signifi-
        cance for appraisal procedures throughout the
        school program.
( )  5. All of the students in the "I" groups profited by
        opportunity to inspect the results.
( )  6. We need to know that the "A" and "B" materials
        were of equal difficulty before we can accept the
        results with confidence.
( )  7. The groups were of about the same average mental
        ability.
( )  8. There was apparently a high correlation between
        students' scores on the two types of materials.
( )  9. One could not expect to get similar results if non-
        text materials were used.
( ) 10. If 1,000 students had participated in each group the
        results would have been about the same.
( ) 11. The study is not worth publishing because there
        were only 30 items in each test.
( ) 12. Most of the influence of possible differences between
        the groups in intelligence and reading skill was bal-
        anced out by shifting the groups before they studied
        the B materials.

These items were scored by the deviations method. For
example, if a student marked a conclusion with a 5 when the
key called for a 1, the score on that item was a minus 4. A con-
stant of 35 was set for each interpretation of data problem.
The student's score on the problem is the constant, minus the
sum of the deviations of the items from the test key.
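
The deviations method just described can be sketched in a few lines. This is only an illustration under stated assumptions: the deviation on each item is taken to be the absolute difference between the student's mark and the keyed mark (the text gives only the 5-versus-1 example), and the function name, the twelve-item key, and the student's marks are hypothetical.

```python
def score_interpretation_problem(responses, key, constant=35):
    """Score one interpretation-of-data problem by the deviations method.

    Assumption: each item's deviation is the absolute difference between
    the student's mark (1-5) and the keyed mark (1-5).
    """
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    total_deviation = sum(abs(r - k) for r, k in zip(responses, key))
    # The problem score is the constant minus the summed deviations.
    return constant - total_deviation


# Hypothetical key and marks for a twelve-item problem (not from the test).
key = [4, 2, 3, 4, 3, 3, 3, 3, 3, 3, 5, 2]
responses = [5, 2, 3, 4, 3, 1, 3, 3, 3, 3, 5, 1]
print(score_interpretation_problem(responses, key))
```

With a constant of 35, a student who matches the key on every item keeps the full 35 points, and each point of disagreement with the key costs one point.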




Part III, Quality of Decision and Reason.—The following
problem is illustrative.


Jimmy Allen is a likeable, apparently well-adjusted and
popular youngster in our school. [...] is now in his sophomore
year and [...] in the College Preparatory [...]. In class he does
not seem to be paying much [...] going on. Often he spends
his time [...] gets [...] on the strength of his [...] humor and
sudden spurts of work. Studies do not [...]. If he gets them,
O.K.; if not, that's O.K., too. [...] the fact that he "never
takes a book home." [...] Jimmy wants to be a [...] to think
of such things. [...] club and is a member of [...] several other
organizations. His Henmon-Nelson Test of Abilities score is at
100 percentile for his age group.

Which of the following is the best attitude or procedure
to take?

( ) A. Jimmy is a normal, average youngster and presents 
no particular problem in good educational practice. 

( ) B. A comprehensive program should be worked out co-
operatively with Jimmy so that he will make fullest
use of his exceptional abilities.

( ) C. This is primarily a problem of misdirected interest. 
Jimmy needs a program designed to interest him 
more in the academic aspects of his school life. 

If you believe a statement below gives sound support to one
or more of the three decisions listed above, place an X in one
or more of the appropriate parentheses. If you believe a
statement to be poor or unsound support for all decisions
place an X in the unsound column.

 A   B   C  Unsound
( ) ( ) ( ) ( )   1. The school's primary responsibility
                     is to the average child.
( ) ( ) ( ) ( )   2. Bright children will take care of
                     themselves satisfactorily.
( ) ( ) ( ) ( )   3. "All work and no play makes Jack
                     a dull boy."
( ) ( ) ( ) ( )   4. "A great mistake in modern educa-
                     tion is its waste of genius."
( ) ( ) ( ) ( )   5. While extra-curricular activities are
                     important it is still true that aca-
                     demic achievement is the desired
                     goal.
( ) ( ) ( ) ( )   6. Exceptional ability in children is
                     often not recognized.
( ) ( ) ( ) ( )   7. It appears clear that Jimmy has
                     more social and athletic ability than
                     academic ability.
( ) ( ) ( ) ( )   8. The guidance of every child rather
                     than the child deviate should be our
                     ultimate aim.
( ) ( ) ( ) ( )   9. This case appears to be primarily
                     a matter of misdirected unbalanced
                     motivation.
( ) ( ) ( ) ( )  10. One of Jimmy's teachers says, "By
                     golly, you don't need to worry about
                     that boy. He'll get along all right."
( ) ( ) ( ) ( )  11. Gifted children are very likely to
                     turn out poorly.
( ) ( ) ( ) ( )  12. Most school problems are matters of
                     multiple causation.


A value of 5, 3, or 0 was attached to each of the possible
decisions or courses of action in each problem. In scoring the
student's response on reasons, each of the four possible responses
for each item was considered as a true-false situation. Thus,
if the student had a checkmark in the appropriate parenthesis, it
was counted correct. If he had no checkmark when no check-
mark was called for by the key, the response was correct.
Checkmarks out of place or omitted were counted as errors.
A constant of 48 was set up for the reasons and to this was
added the value that the student received for the course of
action he had chosen. From this sum his errors in reasoning
were subtracted.
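
The Part III rule lends itself to a short sketch as well. It assumes that each of the twelve statements carries four parentheses (the three decisions plus an unsound column), so that 12 x 4 = 48 true-false judgments underlie the constant of 48; the text does not state this derivation, and the function name, key, and marks below are hypothetical.

```python
def score_decision_problem(decision_value, marks, key, constant=48):
    """Return constant + value of the chosen decision - errors in reasoning.

    marks and key hold one 4-tuple of booleans per statement, in the
    assumed order (A, B, C, Unsound); True means a checkmark.
    """
    if decision_value not in (5, 3, 0):
        raise ValueError("decision value must be 5, 3, or 0")
    if len(marks) != len(key):
        raise ValueError("marks and key must cover the same statements")
    errors = 0
    for marked, keyed in zip(marks, key):
        # A parenthesis is an error when it is checked but not keyed,
        # or keyed but left unchecked.
        errors += sum(m != k for m, k in zip(marked, keyed))
    return constant + decision_value - errors


# Hypothetical key for twelve statements (not the key used in the test).
KEY = [(False, False, False, True)] * 6 + [(False, True, False, False)] * 6
# A student who agrees with the key except for two parentheses.
MARKS = list(KEY)
MARKS[0] = (True, False, False, True)    # one checkmark out of place
MARKS[7] = (False, False, False, False)  # one checkmark omitted

print(score_decision_problem(5, MARKS, KEY))
```

Treating every parenthesis as its own true-false judgment means a student is credited for correctly leaving a space blank, which is the point the scoring description makes.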


Part IV, Tendency to Keep up with Current Professional
Literature.—The following items are illustrative.


( )  1. Prepared a monograph on the use of autobiography and
        diary in psychological research.
( )  2. Experiments with social groups under autocratic and
        democratic leadership.
( )  3. Author of important [...] on counseling, psychotherapy
        [...] treatment of [...] behavior.
( )  4. Author of important book on [...] the educative [...].
( )  5. Development of normative tables of child development,
        author of many books on the topic.
( )  6. Developed the multiple growth (or "split growth") [...].
( )  7. One of the [...] psychol[...] one of the best reviews of
        the psychology of learning.
( )  8. Reported a comprehensive study of adolescents by
        means of a case report against the background of
        group data collected.
( )  9. Wrote a volume recently on the meaning of intelligence,
        describing results of the "Iowa Child Welfare Studies."
( ) 10. Wrote a recent philosophical and psychological book on
        Human Nature.

A. Lewin, Lippitt
B. Allport, G.
C. Olson, W.
D. Thorndike, E. L.
E. Gesell, Arnold
F. Cole, Luella
G. Jones, Harold
H. Rogers, Carl
I. Kuhlen, R. G.
J. Stoddard, G.
K. Prescott, D.
L. McConnell, T. R.
M. Coghill
N. McGeoch, John


In the left-hand column an effort was made to describe an
important recent contribution in professional education. The
column to the right listed the names of the persons making
these contributions. The score is simply the number of items
properly matched.


Summarizing and Interpreting the Results 


The items were so arranged in the test that sub-scores for
the four areas of the core and for the four objectives of the
core were readily obtainable for each candidate. Table 1 shows
the raw sub-scores of a candidate as recorded on the back of
the answer sheet.

The scores are then recorded on a profile chart for each

TABLE 1
Sub-Test Scores for an Individual Candidate*

                                              Ed.      Measurement    Ed. Philosophy
                                  Research  Psychol.  and Statistics  and Sociology   Total
Part I, Knowledge of
  Fact and Principle ..........      16        14          18              16           64
Part II, Interpretation
  of Data .....................                30          29              26           85
Part III, Quality of
  Decision and Reasons ........    [...]     [...]       [...]           [...]        [...]
Part IV, Current Pro-
  fessional Literature ........       8         4           1               2           15
Total .........................      87        91          85              88          351

* No research item was prepared for interpretation of data because interpretation
of data items in other core areas were based on research. Instead, an extra problem
calling for choices of research techniques was placed in Part III.


candidate. The illustration below is the record of a
better-than-average candidate in the Spring of 1945. Percentile
equivalents for raw scores were calculated from the performance
of the candidates who had taken the examination in 1944. The
profile shows the relative standing of the candidate on the four
objectives, in the four major core areas, and in over-all per-
formance.

[Figure: REPORT, MASTER'S EXAMINATION. Profile chart on a
percentile scale for one candidate, printed beside a ranked list
of total raw scores and total weighted T-scores. Scores are for
the 52 students who took the test in the spring and summer of
1944.]


The total raw score was 351, which placed the candidate
between 17th and 18th in terms of the 52 who had previously
taken the test. The combined weighted T-score¹ was 150, which
ranked the candidate 22nd among fifty-two. The combined
T-scores correlate about .98 with the raw scores. The T-score,
however, gives the truer picture and is exceedingly useful in
considering candidates whose performance is marginal.
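
The T-score conversion defined in the footnote to this passage (mean set at 50, one sigma equal to ten points) can be illustrated briefly. This is a sketch only: the function name and the sample scores are hypothetical, the use of the population standard deviation is an assumption, and the weighting behind the combined weighted T-score is not given in the text, so it is left out.

```python
from statistics import mean, pstdev


def t_scores(raw_scores):
    """Convert raw scores on one part of a test to T-scores.

    The group mean maps to 50 and each standard deviation to 10 points.
    Assumption: the population standard deviation of the group is used.
    """
    m = mean(raw_scores)
    s = pstdev(raw_scores)
    if s == 0:
        raise ValueError("all scores are identical; T-scores are undefined")
    return [50 + 10 * (x - m) / s for x in raw_scores]


# Hypothetical raw scores on one part of the test for five candidates.
part_scores = [64, 58, 71, 49, 60]
for raw, t in zip(part_scores, t_scores(part_scores)):
    print(raw, round(t, 1))
```

Because every part is put on the same scale before the parts are combined, a part with a wide spread of raw scores does not dominate the total, which may be why the T-score is said to give the truer picture.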


Analysis of the Test 


Table 2 shows some of the characteristics of the present 
examination. The total test has a reasonably high reliability 


TABLE 2
The Reliability* of the Total Test and Parts of the Test and Correlations
between Parts of the Test. (N=50)

                                                   I      II     III     IV
Part I, Knowledge of Fact and
  Principle ...............................      (.72)   .48    [...]   .57
Part II, Interpretation of Profes-
  sional Data .............................              (.71)   .39    .26
Part III, Quality of Decision and
  Reasons .................................                     (.61)   .23
Part IV, Knowledge of Current
  Professional Literature .................                             (.79)
Total .....................................      (.87)

* Figures in ( ) show reliabilities computed by the split-halves method and cor-
rected to the full length of the test. Other coefficients of correlation show inter-
relationships.


(.87) for an unrevised edition. Part IV, Knowledge of Current
Professional Literature, has a fairly satisfactory reliability. The
other parts are weak, especially Part III. Part I is most highly
related with other parts of the test. For the most part, how-
ever, the correlations are sufficiently low as to indicate that the
various parts are by no means measuring the same type of
achievement.
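
The reliabilities in Table 2 are described as computed by the split-halves method and corrected to full length. The text does not name the correction; the sketch below assumes the usual Spearman-Brown step-up, with hypothetical half-test scores and function names.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length (at least 2 scores)")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no spread")
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to the full length of the test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(first_half, second_half):
    """Correlate the two half scores, then correct to full length."""
    return spearman_brown(pearson_r(first_half, second_half))


# Hypothetical half-test scores (e.g., odd and even items) for six candidates.
odd_item_scores = [12, 15, 9, 20, 17, 11]
even_item_scores = [10, 16, 8, 18, 19, 12]
print(round(split_half_reliability(odd_item_scores, even_item_scores), 2))
```

Under this correction a half-test correlation of .77 (an invented input, not a figure from the table) steps up to about .87, which shows how much of a reported full-length coefficient comes from the correction itself.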


Table 3 shows the relationship between the parts of the test
and achievement in graduate courses. A correlation of [...] is


¹ T-scores were determined for each part of the test. In deriving the T-score
the mean score is given a value of 50. A T-score of 60 then is the equivalent of
plus one sigma and a T-score of 51 represents a score one-tenth of a standard devia-
tion above the mean.


about as high as could be expected considering the reliability
of Part I and of grades. The correlation between the remaining
three parts of the test and scholarship is so low as to raise some
serious questions.
TABLE 3 
Relationship of the Test and its Parts to Graduate Scholarship 


Grade Point 
Ratio* 


* Grade Point Ratio is the number of honor points divided by the number of
credit hours.


First to be considered is the validity of the comprehensive
examination. All goals covered by the test were approved by
the faculty. All items in all parts were submitted, revised, and
[...] by the professors responsible for core courses. In most
cases the items and key were reviewed by at least three staff
members. At present it seems reasonable to place more confi-
dence in logical or empirical validity than in grades as criteria
for the validity of the examination.
Validity of grades is the next consideration. The group of
candidates who first took the examination in the Spring of
1944 were highly frustrated and exceedingly critical in their
reactions to it. They had little previous experience with items
like those in Parts II and III, and Part IV. The low coefficients
of correlation between scholarship and Parts II, III, and IV
give some basis for their reaction, for the low coefficients
indicate that students were appraised too exclusively on knowl-
edge of fact and principle in their courses. By the summer
[...], students showed a great anxiety for the [...]
examination. The "grape vine" had been operating. This
anxiety has been relieved somewhat in subsequent tests through
the distribution of keyed sample items to candidates some
[...] in advance of the examination.
Next Steps in the Development and Use of the Examinations


The School of Education faculty is now working on a revision
of the core program. As soon as this has


been accomplished and approved by the Faculty, a new exami- 
nation will be developed. It will probably be built over a pat- 
tern similar to the original. Item analyses of the current edi- 
tion of the test will be made so that appropriate items of known 
value may be used in the revised form for the new core program. 

In addition to its usefulness for general appraisal, the com-
prehensive examination has proven to be quite sensitive for
purposes of diagnosis. It might well be administered to candi-
dates as soon as they have completed their core courses rather
than at the end of their Master's program. In addition to pro-
viding a basis for faculty action, the examination would thus
enable the adviser to help the student to plan the remainder
of his program in order that he might strengthen his back-
ground in the areas of revealed weakness. There are several
other reasons for this recommendation. The core is really con-
sidered the foundation for the Master's program for teachers,
but many students have been delaying their enrollment in core
courses until near the completion of their Master's program in
order to be more freshly prepared for the examination. Then,
too, teachers' superintendents, their boards of education, and
friends know they are in school to complete their Master's pro-
gram. The threat of failure on the comprehensive examination
hangs heavily upon these teachers. Taking of the comprehen-
sive examination midway in their programs would not neces-
sarily reduce threat of failure, but it would reduce the stigma
attached.

Candidates for the doctorate of Education and the doctorate
of Philosophy are required to take a diagnostic examination
within 15 semester hours after completion of their work for the
Master's degree. The Master's comprehensive examination
serves as an excellent diagnostic instrument for four of the
seven areas covered by the Doctor's diagnostic. The other
three areas, Administration and Organization; Supervision and
Curriculum; and Personnel and Guidance, will eventually be
covered by similar examinations.


Summary 


1. A comprehensive examination for Master of Science in
Education candidates was constructed to gather evidence of
progress toward four major goals: knowledge of fact and prin-
ciple in the professional literature; ability to interpret profes-
sional data; ability to make good decisions and give sound
reasons for them when confronted by a professional problem;
and a tendency to keep up with current professional literature.

2. Reliability of the whole test was good; for parts of the
test it was fairly satisfactory, considering the complexity of
some of the functions measured.

3. Balance of subject-matter coverage and validity were
safeguarded by cooperative staff development of test items.

4. Analyses of test results in relation to success in graduate
courses indicate that grades are based mainly on knowledge of
fact and principle. The test has diagnostic value at both the
Master's and the doctorate level.

5. Test results for each candidate are scored so as to show
achievement with respect to the four objectives and the four
areas of the core.


A STUDY OF PSYCHOLOGICAL REPORTS IN A
SCHOOL SYSTEM


EDWIN A. FENSCH 
Mansfield City Schools, Mansfield, Ohio 


A study was recently made of 719 psychological examina-
tions [...] the Guidance Department of the Mansfield,
Ohio, City [...] to determine what teachers actually wanted
when they asked for psychological examinations of pupils.
Heretofore, when teachers were involved with a problem
child, the request was, "I wish you would make an intelli-
gence test of Johnny (or Mary)." As most persons interested
in guidance know, this request does not truthfully state what
the teachers [...] on such a problem.

These [...] represent reports made by five different
psychologists from the period 1934 to 1944, and ranged from
pupils in the [...] grades to those in the senior high school.
Not all [...], unfortunately, stated the reason for the psycho-
logical [...], but in many cases the psychologist did
state why the teacher felt an examination should be made.
These [...] were tabulated and divided according to sex.
The table reveals some interesting information.

It will be noted, first of all, that twice as many boys as girls
were cited for examinations during this ten-year period. Some
interesting speculation may arise as to why this could happen.
For one thing, boys may be more subject to emotional difficul-
ties during their school period than girls because of the tradi-
tional attitude toward boys with such difficulties. Boys are
[...] that they must be "manly." That is, they must inhibit
their natural tendencies to obtain relief in case of [...] "take
it on the chin!" For [...] of life for boys because it teaches
them to [...] making an outward demonstration of the tur-
[...]. History records periods in which it was fashionable
[...] to cry in tense moments. But today this would [...]
anything but fashionable. Therefore, boys must learn to



bury their difficulties. Secondly, the writer believes that this 
table reflects the fact that the Mansfield Schools have many 
more women teachers than men. Studies in grades awarded 
by women teachers and men teachers to the sexes have shown 
that women teachers tend to give higher grades to girls than to 


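Reasons for Requesting Psychological Examinations

                                                    Boys    Girls
Cannot do the work of the class ...............      90       1
Emotional difficulties ........................      22       6
Has reading difficulties ......................      29       8
Discipline problem ............................      27       6
Bad home and family conditions ................      26      13
Cannot adjust to school situation .............      25      11
Poor physical health ..........................      24       3
Broken home ...................................      21      12
Grade placement ...............................      19      17
Has pupil superior intelligence? ..............      16     [...]
To enter Sight Saving School ..................      14      10
Probable defective vision .....................      14       2
Has no initiative .............................      13       0
Failed ........................................      13       6
Too frequent absence ..........................      12       5
[...] .........................................      12       4
Examined because of family's interest .........      12       1
Speech defect .................................      12       3
[...] .........................................      11       4
[...] .........................................      10     [...]
[...] .........................................    [...]     16
[...] .........................................       7       4
[...] .........................................       6       4
[...] .........................................       5       0
[...] .........................................       4       1
[...] .........................................       2       4
Wants out of Opportunity School ...............       3       0
[...] .........................................       2       0
Arithmetic difficulties .......................       1       0
[...] .........................................       1       0
Threatened suicide ............................       1       1
Immigrant to U. S. ............................       0       1
Wants to leave school .........................       1       1
[...] .........................................       0       0
Mental abnormality ............................       1       0
Tutored student; check-up .....................       1     [...]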


boys. The writer believes that women also tend to be more
lenient toward girls with problems of adjustment than [...],
on the whole, to boys. Consequently, while girls also are [...]
in displaying the fact that they are in difficulty, and more [...]
obtain assistance from adults in these matters, women teachers
tend to understand girls better than boys and to look upon
boys as more difficult cases to handle. This may account to
some extent for the greater number of boys cited for psychologi-
cal examinations than girls.

The table next shows that many of the difficulties for which
teachers requested the aid of a school psychologist were cer-
tainly not based on the intelligence factor alone. Some of the
requests, as noted by the psychologists, gave the examiner in-
adequate bits of information; some were even ambiguous or
difficult to analyze. Such reasons as "Cannot do the work of
the class" can mean a number of things. This may mean that
the individual is a slow learner and finds the work too difficult
for his ability. On the other hand, one can list without refer-
ence to intelligence a variety of reasons from physical conditions
to home conditions that would make it difficult for the pupil to
"do the work of the class." Some might argue that this is the
work of a pupil personnel specialist; but on the other hand, no
teacher should attempt to catalog a pupil's difficulties without
having made an investigation himself of the social and economic
environment in which the pupil lives. With such information
at hand, it is probable that the teacher could have made a
better statement of the suggested reasons for a particular
pupil's difficulties.
Since the writer is well acquainted with the manner in which
teachers in his system ask for psychological examinations, it is
necessary to point out that such difficulties as emotional dis-
turbances, reading difficulties, discipline problems, maladjust-
ment, failure, absence, truancy and others may not be based on
the intelligence of the individual.


A few of the items listed in the table may even indicate poor
teaching methods. "Cannot adjust to the school situation, has
no initiative, failed, too frequent absence, passive, not inter-
ested in school, truancy, arithmetic difficulties, and wants to
leave school," may easily indicate poor educational procedures
rather than a need to investigate a pupil's intelligence. These
reasons, if they were sound, could more properly come under
such [...] as problems in methods, curriculum, [...] school rela-
tions.

Some of the reasons given actually call for the services of a
physician instead of a psychometrician. "Defective hearing,
epileptic, probable defective vision, poor physical health, ner-
vous breakdown, and mental abnormality," should have been
referred to a physician or a psychiatrist, rather than to a
psychometrician. Similarly, pupils with speech defects, non-
readers, and the like need specialists in these particular fields,
not a specialist in intelligence testing.

It is plain from this study that what such a school system
needs is a request form, directed to a Guidance Department,
such as has been developed for the Mansfield Schools since
this investigation was completed. Such a form would permit
teachers to make requests for examinations for a variety of
reasons. These requests could then be turned over to a special-
ist to whose field such difficulties apply. It is not only ridicu-
lous to ask a psychologist to deal with pupils with defective
hearing or eyesight; it is also a waste of time, or perhaps a
dangerous procedure. The use of a general request form may
avoid such errors.

It naturally follows that schools, becoming more aware of
the need for the wider aspects of guidance, are beginning to see
the advantages of the full-time services of a physician. Relying
mainly on the family to remedy a pupil's physical defects has
not been too successful. The fact that teachers noted [...]
shows that the family probably did not know about the [...]
or did nothing about it. If the family had known it and the
plan of relying on the family's cooperation with their own phy-
sician had worked, the defect might have been eliminated
before it became a school problem. For the sake of the [...]
and the welfare of society, these matters often become problems
which the school must handle.

Finally, the time has come when teachers must begin to
understand and to use the principles of guidance; they must
leave the narrow sphere of teaching only subject matter and
look upon the whole child. Teachers need to understand chil-
dren, what they do and why they do things, much more than
they need to know the principles of English, Mathematics,
Geography, or whichever subject they happen to be teaching.
Until this wider understanding is prevalent among teachers,
psychologists and specialists will continue to receive requests
such as were listed in this table, and problem children will con-
tinue to be cited for examinations in large numbers.


THE SHIPLEY-HARTFORD SCALE AS AN
INDEPENDENT MEASURE OF
MENTAL ABILITY¹


COMMANDER ROBERT J. LEWINSKI, H(S)
United States Naval Reserve

[...] psychological examinations are conducted to [...] not
only reliable estimates of native intellectual endowment, but
also to provide insight into the individual's personality struc-
ture through the application of well-standardized, objective
tests. In clinical practice, the tendency appears to be toward
the detection of psychopathology by means of these tests, thus
facilitating differential psychiatric diagnosis. Examples include
the Rorschach, Thematic Apperception, and [...] which are
widely used to aid [...] psychiatric appraisal. Other [...]
Shipley-Hartford, and Hunt-Minnesota [...] intellectual
deterioration [...] such organic conditions as arteriosclerosis
[...], etc.

[...] following psychological examination [...] employed
have not yielded [...] the statistical standardization [...].
The examiner may then discard his data as useless or attempt
to use it for a purpose other than that for which the test was
originally intended, as, for example, the use of a negative
Rorschach record in the estimation of intellectual development.

The purpose of the present paper is to indicate the possible
use of data derived from one such test, the Shipley-Hartford
Scale (4, 5), as an independent measure of intellectual
status in instances where the scores obtained on this test were
negative in indicating intellectual deterioration, the purpose
for which the scale was principally devised.

¹ The opinions and assertions contained in this paper are those of the writer
and are not to be construed as official or reflecting the views of the Navy Depart-
ment or the naval service at large.

The Shipley-Hartford Scale, described thoroughly in the
two references cited above, was constructed to detect intellec-
tual impairment and deterioration, and is "based on the clinico-
experimental observations that in mental deterioration vocabu-
lary level tends to be affected but slightly, while the ability to
see abstract relationships declines rapidly" (4, p. 371). The
scale is composed of two parts, a test of abstract thinking, and
a multiple-choice vocabulary test. The abstract thinking test
comprises 20 items with a ten-minute time limit, and provides
an abstraction age derived from norms obtained through stand-
ardization on 1046 normal individuals for whom intelligence
test scores were available. The vocabulary test, which is given
first, consists of 40 items, has a ten-minute time limit, and
yields a vocabulary age likewise derived from the standardi-
zation procedure mentioned above. A total mental age is
obtained by combining both parts of the test. Respective relia-
bility coefficients are reported as .89, .87, and .92. It is held
that the last of these coefficients "virtually represents the scale's
reliability when used as a measure of intelligence" (4, p. 376).
These mental age norms were determined from scores obtained
by the standardization group on a variety of group intelligence
tests; however, it was felt that no constant error was introduced
by this procedure. It is important to note the fact that the
standardization was based on group tests presumably of the
paper-and-pencil variety in evaluating the data to follow.

The principal score obtained from the scale is the conceptual
quotient, or CQ. This quotient is essentially the result of divid-
ing the abstraction age by the vocabulary age, although actually
it is obtained by a more complex formula. The conceptual
quotient represents the degree of intellectual impairment or
deterioration and is thus significant in determining possible
deviations from original mental level. Degrees of deterioration
represented by the conceptual quotients are as follows: Above
90, normal; 85-90, slightly suspicious; 80-85, moderately sus-
picious; 75-80, quite suspicious; 70-75, very suspicious; below
70, probably pathological.²
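
The interpretive bands listed above can be written out as a small lookup. This is a sketch under stated assumptions, not the scale's published procedure: the bands as printed share their end points, so each band is assumed here to include its lower limit (a quotient of exactly 90 falls in the 85-90 band); the ratio function is only the rough "abstraction age divided by vocabulary age" approximation mentioned in the text, multiplied by 100 as an assumed scale; and all names and sample ages are hypothetical.

```python
def approximate_cq(abstraction_age, vocabulary_age):
    """Rough conceptual quotient: abstraction age over vocabulary age, times 100.

    Assumption: the scale's actual formula is more complex, as the text notes.
    """
    if vocabulary_age <= 0:
        raise ValueError("vocabulary age must be positive")
    return 100 * abstraction_age / vocabulary_age


def classify_cq(cq):
    """Map a conceptual quotient to the descriptive band quoted in the text.

    Assumption: each band includes its lower limit, since the printed
    bands overlap at 70, 75, 80, 85, and 90.
    """
    if cq > 90:
        return "normal"
    if cq >= 85:
        return "slightly suspicious"
    if cq >= 80:
        return "moderately suspicious"
    if cq >= 75:
        return "quite suspicious"
    if cq >= 70:
        return "very suspicious"
    return "probably pathological"


# Hypothetical subject: abstraction age 14.0, vocabulary age 16.0.
cq = approximate_cq(14.0, 16.0)
print(cq, classify_cq(cq))
```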
The subjects used in this research were 100 white males,
referred for psychological examination in conjunction with
psychiatric observation. All represented relatively benign psy-
chiatric disturbances, including such entities as incipient psy-
choneurosis, mild fatigue states, migraine headache, situational
maladjustment, etc. No psychotics were included in the group.
Each patient was examined with a routine battery of psycho-
metric tests which included the Shipley-Hartford Scale and the
complete Wechsler-Bellevue Adult Intelligence Scale (8).
The age range of the subjects was from 17 to 38 with a mean
chronological age of 23.7 years. Educational attainment ranged
from the 7th grade to graduation from college. The mean
school grade completed was 11.5.

The psychometric data were analyzed with a view toward
discovering relationships among the three scores from the Ship-
ley test and the verbal, performance, and full scale Bellevue
IQ's. Since the Bellevue scales do not yield mental ages, the
vocabulary, abstraction, and total mental ages obtained from
the Shipley test were compared with intelligence quotients.
Regardless of this fact, it is believed that the procedure will
[...] represent the relationships among the various parts
of the two tests.

The Shipley vocabulary ages ranged from 11.5 to 20.6, with
a mean vocabulary age of 16.2. The standard deviation of the
distribution was 2.0. The mean abstraction age was 16.3 with
a range of from 11.5 to 20.5 and a standard deviation of 2.0.
Total mental ages ranged from 11.5 to 20.2, with a mean and
standard deviation of 16.5 and 1.9 respectively. It is apparent
that a marked relationship exists among the three sets of [...]
not only insofar as range of scores is concerned, but in the
measures of central tendency and variability as well.

ing; Ptual quotients were within normal limits, пв 

ожа z i nt or = 

Чо ша, the existence of mental impairme 


his finding was eventually substantiated by clinical 
оп. 


e with mental defectives, рео хін 
i - i gree tha 
“abularie чаве difficulties, and individuals 


аг 
les are affected. 


256 EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT 
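The conceptual quotient bands quoted above can be illustrated with a short Python sketch. It is not part of the original study. It assumes the simplified ratio form of the quotient (abstraction age divided by vocabulary age, times 100), whereas the scale itself uses a more complex formula, and the treatment of a quotient falling exactly on a band boundary is also an assumption. The function names are hypothetical.

```python
def conceptual_quotient(abstraction_age, vocabulary_age):
    """Simplified CQ: 100 * abstraction age / vocabulary age (assumed form)."""
    if vocabulary_age <= 0:
        raise ValueError("vocabulary age must be positive")
    return 100.0 * abstraction_age / vocabulary_age


def classify_cq(cq):
    """Map a conceptual quotient onto the bands quoted in the text.

    Boundary values (for example exactly 90) are assigned to the lower
    band here; the source does not say how ties are resolved.
    """
    if cq > 90:
        return "normal"
    lower_limits = [
        (85, "slightly suspicious"),
        (80, "moderately suspicious"),
        (75, "quite suspicious"),
        (70, "very suspicious"),
    ]
    for lower, label in lower_limits:
        if cq >= lower:
            return label
    return "probably pathological"


if __name__ == "__main__":
    # Group means reported above: abstraction age 16.3, vocabulary age 16.2.
    cq = conceptual_quotient(16.3, 16.2)
    print(round(cq, 1), classify_cq(cq))
```

Applied to the group means reported above, the simplified quotient falls in the normal band, in keeping with the statement that the conceptual quotients were within normal limits.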


In regard to data derived from the Bellevue scales, full scale
IQ's ranged from 92 to 137, with a mean of 114.31 and standard
deviation of 9.76. Verbal scale IQ's ranged from 87 to 137, the
mean being 113.05 and the standard deviation 11.57. The mean
performance scale IQ was 112.69. The range was from 92 to
130 and the standard deviation of the distribution was 9.21.
The Bellevue scales indicate that the present group is above
normal when compared with the distribution of intelligence in
the general population.

In Table 1 are found the intercorrelations and standard
errors among the scores obtained on the Shipley test and the
verbal, performance, and full scales of the Bellevue test. It will
be noted that the highest coefficient (.689) is between the Shipley
total mental age and the Bellevue verbal scale, and that the
lowest (.364) is found when the Shipley vocabulary age is correlated
with the Bellevue performance scale. All three Shipley
scores correlate most highly with the Bellevue verbal scale and
lowest with the performance scale. Conversely, all Bellevue
scales correlate most highly with the Shipley total mental age
and lowest with the Shipley vocabulary age.


TABLE 1

Intercorrelations and Standard Errors Among the Shipley Abstraction, Vocabulary
and Total Mental Ages and the Bellevue Verbal, Performance,
and Full Scale IQ's

Variables                                        r       PE
Full Scale IQ's x Vocabulary Age                .577    .06
Full Scale IQ's x Abstraction Age               .609    .067
Full Scale IQ's x Total Mental Age              .653    .060
Verbal Scale IQ's x Vocabulary Age              .635    .059
Verbal Scale IQ's x Abstraction Age             [...]   .053
Verbal Scale IQ's x Total Mental Age            .689    .087
Performance Scale IQ's x Vocabulary Age         .361    .083
Performance Scale IQ's x Abstraction Age        .414    .083
Performance Scale IQ's x Total Mental Age       .417    [...]

Tests of vocabulary have the distinction in clinical psychology
of being fairly valid indicators of general intelligence when
employed independently. It is therefore of interest to compare
the findings of this study with previous investigations of the
relation of vocabulary scores to more complex measures of general
intelligence. Terman (6) reports a correlation of [illegible]
between vocabulary and mental age on the 1916 Revision of the
Binet, while Mahan and Witmer (3) found a coefficient of .87
between these two variables on the same test. Terman and
Merrill (7) obtained an average coefficient of .81 upon correlation
of vocabulary and mental age on the 1937 Stanford Revision.
Wechsler (8) considers vocabulary to be an excellent
measure of general intelligence and reports a coefficient (eta)
of .85 between the vocabulary subtest and the full scale of the
Wechsler-Bellevue test. Thus, the highest relationship between
the vocabulary test of the Shipley scale and any scale of the
Bellevue test is lower than those cited above as existing between
vocabulary and general intelligence. In considering
this discrepancy, allowance should be made for the fact

[...] vocabulary scores were ob[tained] [...], while the Shipley test
[...] test are not too surprising [...] sampled by these [...]
instance lower than the [...] previously demonstrated [...]
In an investigation yet unpublished, [...] found
respective coefficients of .91, [...] Bellevue performance scale
and the full scale, ver[bal] [...] vocabulary subtest.
The Shipley total mental age correlates consistently higher
with all Bellevue scales when compared with the vocabulary
and abstraction ages. It may therefore be concluded that the
total mental age will represent best an individual’s mental level
when the Shipley scale is employed as an independent measure
of intelligence. Since the coefficient of .689, existing between
the Shipley total mental age and the Bellevue verbal scale,
represents the highest degree of correlation between the two
tests, and since this coefficient in itself cannot be considered
remarkably high, it is obvious that caution must be exercised
in any such interpretation. Nevertheless, it is noteworthy that
this coefficient is slightly higher than those reported to exist
between the Bellevue verbal scale and Scales A and B of the
Herring Revision of the Binet-Simon Tests (1, 2), which are in
themselves complete measures of intelligence.

The Shipley abstraction age occupies a middle position insofar
as its correlation with the Bellevue scales is concerned, and
since this test is designed to measure a specific function (conceptual
thinking), there is no reason to assume that it should
be highly related to general intelligence. It is surprising, however,
that this part of the Shipley test correlates more highly
with the Bellevue scales than does the vocabulary test,
which, as pointed out above, samples a function shown in
previous investigations to be a fairly reliable indicator of mental
level.

The use of the Shipley scale as an index of intellectual level
is subject to the same limitations as are group tests of intelligence
generally. The most serious drawback is, of course, the
fact that the subject’s motivation cannot be determined and,
if low, directed or controlled. On the other hand, minor reading
defects should not affect the scores on the Shipley test to the
degree that they do those of group tests of intelligence, since in
the Shipley test the reading of meaningful sentences (except
in the directions) is unessential. In conclusion, it should be
stressed that the absence of pronounced correlation with the
Bellevue scales does not detract from the test's value as an
index of deterioration or impairment, which admittedly is its
primary purpose.

Summary 


The performance of 100 white males, ranging in age from
17 to 38, was compared as regards their function on the Shipley-
Hartford Retreat Scale and Wechsler-Bellevue Adult Intelligence
Scale, with a view toward determining the significance
of the Shipley test when used as an independent measure of
intelligence. The highest coefficient of correlation was found
between the Shipley total mental age and the Bellevue verbal
scale, and the lowest between the Shipley vocabulary age and
the Bellevue performance scale. The three Shipley scores
correlate most highly with the Bellevue verbal scale, and all
Bellevue scales correlate most highly with the Shipley total
mental age. In view of this, it is concluded that the Shipley
total mental age will represent best the individual’s mental level
if used independently for that purpose. The lack of remarkably
high correlation between the Shipley scale and the Bellevue
test does not detract in any way from the validity of the
former as an index of deterioration.


REFERENCES 


1. Lewinski, R. J. “Experiences with the Herring Revision of
the Binet-Simon Tests in the Examination of Subnormal
Naval Recruits.” American Journal of Mental Deficiency,
XLVIII (1943), 157-161.

2. Lewinski, R. J. “Further Experiences with the Herring Revision
of the Binet in Examining Naval Recruits.” American
Journal of Orthopsychiatry, XIV (1944), 396-399.

3. Mahan, H. C. and Witmer, Louise. “A Note on the Stanford-
Binet Vocabulary Test.” Journal of Applied Psychology,
XX (1936), 258-263.

4. Shipley, W. C. “A Self-Administering Scale for Measuring
Intellectual Impairment and Deterioration.” Journal of
Psychology, IX (1940), 371-377.

5. Shipley, W. C. and Burlingame, C. C. “A Convenient Self-
Administering Scale for Measuring Intellectual Impairment
in Psychotics.” American Journal of Psychiatry, XCVII
(1941), 1313-1325.

6. Terman, L. M. “The Vocabulary Test as a Measure of Intelligence.”
Journal of Educational Psychology, IX (1918),
452-466.

7. Terman, L. M. and Merrill, Maud A. Measuring Intelligence.
Boston: Houghton Mifflin, 1937.

8. Wechsler, D. The Measurement of Adult Intelligence. Baltimore:
Williams and Wilkins, 1944.


UNIVERSITY OF MICHIGAN NORMS FOR THE
UNITED STATES ARMED FORCES INSTITUTE
TESTS OF GENERAL EDUCATIONAL
DEVELOPMENT


WILMA T. DONAHUE 

University of Michigan

Colleges and universities have the task of determining the
potential [...] of thousands of education-bound G.I.'s. These
[...] do not present typical admissions problems [...].
They are already twenty some odd years [old], [...] by a desire
to make up for the lost war years, determined to get an
education, and demanding an opportunity to take advantage of
the educational provisions of the Servicemen's [...]. [...] many
of these individuals [...] had not taken [...] possible the academic
potentiality [...]. The usual criteria of high [school] records and
teachers’ estimates, [...] objective psychological [...] Armed
Forces Institute are [...] additional evidence of scholastic
promise. This battery of tests was administered widely [...].
Also, many admis[sions] [...]




TABLE 1 


United States Armed Forces Institute Tests of General Educational Development
Percentile Norms by Tests for 1,314 University of Michigan Freshmen


Raw Score  Expression  Social Studies  Natural Sciences  Raw Score  Expression  Social Studies  Natural Sciences
107-112 99 56 1 62 E 
105-106 98 55 1 60 80 

104 97 54 0 57 E 

103 96 53 0 53 2. 

102 95 52 0 49 A 

101 94 51 0 46 p 

100 93 50 0 42 68 

99 92 49 0 38 а 
98 90 48 0 35 E: 
97 89 47 32 2А 
96 87 46 30 23 
95 85 45 27 i 
94 83 44 25 46 
93 81 43 20 "n 
92 78 42 18 59 
91 76 41 16 37 
90 73 40 13 a 
89 70 39 11 30 
88 68 38 9. 27 
87 64 37 7 24 
86 61 36 6 22 
85 57 35 5 19 
84 53 99 34 5 17 
83 50 99 33 ; 14 
82 47 99 32 12 
81 43 99 31 2 11 
80 39 99 99 30 1 9 
79 35 99 99 29 1 8 
78 32 99 99 28 0 7 
77 29 99 99 27 0 ү 
76 26 99 99 26 0 z 
75 24 98 99 25 0 1 
74 21 98 99 24 0 4 
73 19 97 99 23 0 3 
72 17 97 99 22 2 
71 16 95 98 21 2 
70 14 94 98 20 1 
69 13 92 98 19 1 
68 11 91 98 18 1 
67 10 89 97 17 1 
66 8 87 97 16 0 
65 7 86 95 15 0 
64 7 84 95 14 0 
63 6 82 94 13 0 
62 5 79 92 12 0
61 4 77 90 11 0
60 4 74 89 10 0
59 3 71 88 9 0
58 2 68 86 8

57 1 65 84 


[...] men they take the battery [...]. Four different tests,
con[...] principle, are included in the battery. These are:
(1) Correctness and Effectiveness of Expression; (2) Interpretation
of Reading Materials in the Social Studies; (3) Interpretation
of Reading Materials in the Natural Sciences; and
(4) Interpretation of Literary Materials. These [...] scores
[...] as selection [...]. Crawford and Burnham (1) found in
their study of a relatively small group of Yale students that the
G.E.D. tests correlated with first-semester grades as well as the
College Entrance Examination Board Tests. On the basis of these
results they established an upper level critical score above
which applicants are admitted although they may lack the
usual entrance requirements.
The United States Armed Forces Institute has published
tentative college norms for different types of institutions
and recommends that local norms be established also. The
Registrar’s Office of the University of Michigan requested the
Bureau of Psychological Services to include these tests in the
regular Orientation Week freshman examination program. As
time was not available for more than three of the tests, it was
decided to omit the test on Interpretation of Literary Materials.

The three tests were administered to 1,314 entering freshmen
at different testing periods within a period of one week.
The group was made up of both men and women but the
[...] predominated. The results of the tests indicate that the
national norms even for Type I institutions are somewhat low
in comparison to the University group. For this reason it
would seem to be of value to present the normative data
established at the University of Michigan as a guide to other
institutions of a similar nature.

Table 1 presents the raw scores and the percentile equivalents
for each of the three tests.


REFERENCES

1. Crawford, A. B. and Burnham, P. S. “Trial at Yale University
of the Armed Forces Institute General Educational
Development Tests.” Educational and Psychological
Measurement, IV (1944), 261-270.

2. Lindquist, E. F. “The Use of Tests in the Accreditation of Military
Experience and in the Educational Placement of War
Veterans.” Address to the National Association of State
Universities, Chicago, 1944.

3. The United States Armed Forces Institute Tests of General
Educational Development, Examiners Manual (College Level).
American Council on Education, 1944.


A STUDY OF THE VALIDITY OF THE ARMED 
FORCES INSTITUTE TESTS OF GENERAL 
EDUCATIONAL DEVELOPMENT IN 
THE FIELD OF SOCIAL STUDIES 


MARY EDITH BRADLEY 
Illinois State Civil Service Commission 
With a steadily increasing number of veterans returning
to educational institutions all over the country, attention is
being [...] to the Tests of General Educational Development,
which were made available by the United States Armed
Forces Institute. In an effort to establish local norms for the
purpose of granting academic credit on the basis of achievement
on these tests, MacMurray College administered one of these
tests to 100 of its students.

The test in Interpretation of Reading Materials in the
Social Studies (Civilian Form) was administered in March
1945. The college was particularly interested in seeing
whether this test would discriminate between those students
who had more academic hours and who had earned the better
grades in social studies and those who had lower grades or
fewer academic hours in social studies.

[...] with other colleges and univer[sities] [...] man and woman
in military service [...] have had some form of training which is
of potential value in a high school or college program.
Measurement is difficult but necessary if veterans are to be placed
in appropriate courses of study now that the war is over. They
deserve the granting of sufficient academic credit to place them
in the college curriculum at a level consistent with their interests
and abilities. To meet this problem, the American Council on
Education constructed these measures. Now the educational
development of the veteran may be so measured that he will
neither be unfairly handicapped because of the nature of his
early training, nor penalized because of his lack of recent
classroom experience.

As is pointed out in the Examiners Manual, the college
level tests, devised by E. F. Lindquist, are used primarily to
determine whether the individual is as capable of carrying on
advanced college work as the student who has taken certain
broad introductory or survey courses.

The sample on which this review is based consisted of 100
students at MacMurray College for Women, divided among
the four classes as follows: Freshmen, 12; Sophomores, 55;
Juniors, 22; Seniors, 11. All subjects were enlisted from the
then-present enrollment in three of the classes being offered
in the field of social studies: Principles of Economics, Principles
of Sociology, and Economic and Political History of the United
States, 1492 to the Present.

The test was administered on a completely voluntary basis,
under work-limit conditions, thus placing the emphasis on
power rather than speed. This procedure was followed because
it undoubtedly will be found to be more satisfactory for use
with returning servicemen and women, who, because of their
lack of recent academic experience and relative unfamiliarity
with objective testing techniques, might be unfairly penalized
by uniform and relatively short time limits. It was found that
a period of 120 minutes per test was adequate for all
persons, and that the majority finished in 90 minutes. The
test was given under optimum testing conditions. In view of
the fact that it was offered on a volunteer basis, it is probable
that cooperation and effort were genuine and that the results
represent true ability, barring uncontrollable factors.

Background information was secured about each participant,
covering each course she had taken in the field of social
studies and the corresponding grade she had received. The
grade-point average of each girl was then computed for the
total amount of social studies so far in college. The results are
given in the following paragraphs:

1. Validity for Total Group. A scatter diagram was prepared,
correlating the total scatter of 100 scores with the grade-point
averages in social studies of those 100 girls. A correlation
of .66 with a P.E. of .038 was found. The correlation chart
revealed that the most discriminating critical score on the test
would be set at 63 out of a possible 91, at which point only one
of the 65 scores of 63 or above is accompanied by a grade-point
average in social studies below a “C.” This may be interpreted
for local purposes as meaning that a score of 63 on this
test probably predicts satisfactory achievement at MacMurray
College in the field of social studies.
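The reported probable error can be checked with a brief Python sketch. The source does not state which formula was used; the sketch assumes the conventional large-sample expression for the probable error of a correlation coefficient, P.E. = .6745(1 - r²)/√N, which agrees with the reported figure for r = .66 and N = 100. The function name is hypothetical.

```python
import math


def probable_error_of_r(r, n):
    """Conventional probable error of a correlation coefficient.

    Assumed formula: 0.6745 * (1 - r**2) / sqrt(n). The article reports
    only the resulting value, not the formula.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


if __name__ == "__main__":
    # Values reported for the total group: r = .66, N = 100.
    print(round(probable_error_of_r(0.66, 100), 3))
```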

2. Validity within Sub-Groups. The scores were divided
into sub-groups according to the number of hours of social
studies the girls had completed. Three divisions of the scores
resulted, grouped according to: (1) those having completed
0-4 hours of social studies; (2) those having completed 5-9
hours of social studies; and (3) those having completed 10 or
more hours of social studies. When grade-point averages of
those girls having 0-4 hours in social studies were correlated
with their scores on the test, a correlation of .64 was obtained,
the number of cases, however, being limited to 32. At the other
end, on 52 students having 10 or more hours of social studies,
when comparing their grade-point averages with scores on the
test, a correlation of .67 was obtained. Reference to grade-point
averages in each case concerns the grade-point average
of each student in the social studies only. Because the number
of cases in the middle group (5-9 hours of study) was so small,
and the scatter did not vary appreciably from the two extreme
groups, no separate correlation with grade-point averages was
determined.

The fact that the differences among the three correlations
that were obtained, .64 and .67 on the low and high sub-groups
and the r of .66 for the entire group of 100, are negligible,
indicates that performance on this test is not greatly affected by
the number of hours a student has had in the field of social
studies at MacMurray College. The medians of the groups
divided according to the number of hours of study are all
within the limits of three standard scores. The median and
quartile deviations, likewise, of the entire group both fall within
the same limits as those of these sub-groups.
3. Comparison of Medians by College Class. Comparison
of medians of the distributions of scores grouped into the four
academic classes, Freshman through Senior, reveals a maximum
difference of seven standard scores occurring between the
Sophomore group and the Senior group. The other two
medians fall within this range. When the statistical test of
significance was applied to determine the reliability of the
difference between the various medians, it was found that only
the difference between the Sophomore and Senior groups, whose
$D/PE_D$ of 5.86, was significant. Using the Sophomore group,
which had the largest N (55), as a basis of comparison, the
$D/PE_D$ was only 1.88 when the comparison was with the Freshman
group, and 2.62 when the comparison was with the Junior
group. All other comparisons yielded small critical ratios. It
may be concluded that although Junior and Senior medians
are somewhat higher than those of the two under classes, no
consistent significant relation is revealed.

This study revealed that scores on this test do correlate to
a significant degree with grade-point averages, but are not
significantly related to the number of hours of study the testees
have had in the field of social studies, nor to grade placement
within a range, at least, of four years. Although the number
of cases involved is small and the conclusions are highly
tentative, the results are reported at this time because of the
general need for data concerning these widely-used tests.


A NOTE ON THE DIAGNOSIS AND TREATMENT OF
SCHOLASTIC DIFFICULTIES


KARL P. ZERFOSS 
George Williams College

During the period when the Navy V-12 program was operating
at George Williams College a rather new approach to the
diagnosis and treatment of scholastic difficulties was worked
out. As [...] in all such units, scholastic deficiencies
were reported to the Educational Office. At first, efforts were
made to deal with these students through the faculty counselors
to whom the men were assigned. Later each department
was [asked] to attempt the diagnosis and treatment of its own
students who were not making satisfactory progress. The following
plan was devised to enlist the aid of the departments
and to assist the instructors in this work.

When scholastically delinquent students were reported to the
Educational Office their names were entered on a form and the
department in which their delinquency fell was indicated.¹
[...] heads, who then called their staffs together for a clinic
session to discuss the students in[volved]. [...] in the conference,
not only as to supposed causes of the difficulty but also as to
suggested treatment and means for carrying it out. The instructors
were requested to estimate which of the failing group
seemed hopelessly deficient, indicated on the form as inadequate
(Column 3), and which offered favorable prognosis for
improvement (Column 5). In each case “Basis of Judgment”
was indicated (Column 4 or 6). This procedure was [...]
to assist the staff members in thinking differently [...]
status and future possibilities. Finally, the instructors
were [asked] to specify just what was to be done in view of
the agreed-upon diagnosis (Column 7).

¹ A copy of the form used in this connection appears at the end of this article.


These forms, when completed, were returned to the Educational
Office, where they were studied to ascertain what additional
remedial steps were needed. For example, among the
measures suggested were the setting up of coach classes by
departments and special individual consultation, details of which
needed to be carried out by the Educational Office.

This procedure had some advantages over the purely individual
approach. In the first place, it brought together several
instructors, who pooled information and insight about the students
concerned. The process made all instructors more conscious
of the necessity for analyzing causation and of planning
appropriate treatment. It doubtless also produced a tendency
toward more adequate individualization of instruction, and, of
course, enabled each department to see more clearly the fruits
of its teaching and to determine where methods, emphasis, and
content should be continued and where modifications might be
made.

The form carried opposite each name a list of courses in
other departments where the student also was having difficulty
(Column 2). This enabled a department to look at the students
more as a whole rather than from the angle of one course
alone.

The study of several of these reports indicated to the department
and to the Educational Office just where special attention
was needed, as frequently the same students were listed from
time to time. It also showed the frequency of assigned causes
for failure and the usual methods depended upon for treatment.

This process of diagnosis and treatment seems to offer a
sound approach. Some of the reasons for this conclusion have
been outlined above but perhaps the major one is that it places
the responsibility squarely in the hands of the instructors,
where it belongs. However, in certain cases, group techniques
(such as the coach class) and special attention by the Personnel
Officer were necessarily introduced as supplementary efforts.

There is every reason to believe that the same process would
be effective in civilian institutions. In the V-12 Unit at George
Williams such fields as English, Mathematics, Technical Drawing,
and Physics required several instructors in each, which
made it possible to bring together faculty members in related
[...] groups for the study of the students referred. In civilian
institutions the same general plan would be of value even if the
number of teachers in a specific department is not so great.
This could be done by grouping the teachers of closely related
if not identical subjects or fields, such as Physical Science and
Social Science. At George Williams we are able to use groupings
from Junior College, Physical Education, and Group Work.
Of course, each institution will need to modify this method to
suit its own situation. Much work should be done upon the
enlargement and refinement of the “bases of judgment and of
remedial measures” (see copy of the form). It is hoped that
there will be further experimentation with this technique, which
is described here in its initial and undeveloped stage.
DEPARTMENTAL REPORT
STUDENTS SCHOLASTICALLY DEFICIENT

Department ____________          Date ____________

Column headings:
    Student
(1) Course in Dept.
(2) Also Below In
(3) Inadequate
(4) Basis of Judgment
(5) Potentially Adequate
(6) Basis of Judgment
(7) Remedial Measures Taken


KEY FOR USE IN FILLING OUT
DEPARTMENTAL REPORT

Basis of Judgment (Columns 4 and 6)

1. Grades.
2. Test Scores.
3. [...] motivation.
4. Emotional responses.
5. Consensus: factors hard to identify.
6. through 10. [blank]

Remedial Measures Taken (Column 7)

1. After class talk.
2. Office conference.
3. Special resources suggested.
4. Private tutoring recommended.
5. Coach class arranged and student invited.
6. Study skills class recommended.
7. Aided during class or lab. session.
8. Referred to Counseling Committee.
9. Let nature take its course.
10. through 12. [blank]


A QUICK METHOD FOR MULTIPLE R
AND PARTIAL r


WILLIAM LEROY JENKINS 
Lehigh University

The multiple R for a 3-variable problem can be obtained
directly from charts computed from the formula:

$$R^2_{o.ab} = \frac{r^2_{oa} - 2\,r_{oa}\,r_{ob}\,r_{ab} + r^2_{ob}}{1 - r^2_{ab}}$$

The multiple R for 4, 5, 6, or more variables can be determined
by setting up the problem as a progressive series of
3-variable multiples, each of which can be secured directly from
the charts. Thus the multiple $R_{o.abc}$ can be worked out in three
steps:

(1) $R_{o.ab}$ from $r_{oa}$, $r_{ob}$, and $r_{ab}$
(2) $R_{c.ab}$ from $r_{ca}$, $r_{cb}$, and $r_{ab}$
(3) $R_{o.abc}$ from $R_{o.ab}$, $r_{oc}$, and $R_{c.ab}$

A 5-variable multiple requires 6 steps and a 6-variable multiple,
10 steps.
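The progressive series of 3-variable steps can be sketched in Python by applying the formula directly in place of the charts. This is an illustration under stated assumptions, not the author's Work Sheet: the recursion (the last predictor is added to the multiple already found for the others) is a reading of the three steps listed above, the function names are hypothetical, and the correlations are the sample-problem values as read from the scanned matrix, with `o` taken as the criterion.

```python
import math


def multiple_r2_three(r_oa, r_ob, r_ab):
    """Squared multiple correlation of o on a and b (3-variable formula)."""
    return (r_oa**2 - 2.0 * r_oa * r_ob * r_ab + r_ob**2) / (1.0 - r_ab**2)


def corr(table, x, y):
    """Look up a zero-order r; the table is keyed by unordered pairs."""
    return table[frozenset((x, y))]


def chained_multiple_r(target, predictors, table):
    """Multiple R of target on predictors, built from 3-variable steps."""
    if len(predictors) == 1:
        return corr(table, target, predictors[0])
    *rest, last = predictors
    primary = chained_multiple_r(target, rest, table)  # e.g. R_o.ab
    inter = chained_multiple_r(last, rest, table)      # e.g. R_c.ab
    secondary = corr(table, target, last)              # e.g. r_oc
    return math.sqrt(multiple_r2_three(primary, secondary, inter))


# Sample-problem correlations (assumed reading of the scanned r matrix).
SAMPLE_R = {
    frozenset(pair): value
    for pair, value in [
        (("o", "a"), 0.44), (("o", "b"), 0.43),
        (("o", "c"), 0.42), (("o", "d"), 0.32),
        (("a", "b"), 0.50), (("a", "c"), 0.56), (("a", "d"), 0.26),
        (("b", "c"), 0.42), (("b", "d"), 0.32), (("c", "d"), 0.20),
    ]
}

if __name__ == "__main__":
    print(round(chained_multiple_r("o", ["a", "b", "c", "d"], SAMPLE_R), 3))
```

With two predictors the recursion reduces to the 3-variable formula itself. With more predictors it only approximates the multiple R found by full computation, which is in keeping with the small difference between chart and computation that the article reports.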

Partial r's can be obtained by determining multiple R's with
each individual variable omitted in turn, and then using the
formula:

$$\text{Partial } r^2 = 1 - \frac{1 - R^2_{\text{all variables}}}{1 - R^2_{\text{all variables except the one being partialled}}}$$

To obtain the multiple R and all of the partial r's in a 5-variable
problem requires 13 steps plus 4 computations of the
formula. In a 6-variable problem, 26 steps and 5 computations
are necessary. The Work Sheet is set up for a complete
6-variable problem. If the multiple R alone is wanted, only
the first 10 steps need to be carried out.
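The partial formula can be illustrated with a short Python sketch. It assumes the standard identity relating a partial r to two multiple R's, with the full multiple in the numerator term and the multiple omitting the partialled variable in the denominator term. The formula yields the magnitude of the partial only, and the function name is hypothetical.

```python
import math


def partial_r_from_multiples(r2_all, r2_without):
    """Magnitude of a partial r from two squared multiple correlations.

    r2_all:     R squared using all predictors.
    r2_without: R squared with the predictor of interest omitted.
    The sign of the partial is not recoverable from this formula.
    """
    ratio = (1.0 - r2_all) / (1.0 - r2_without)
    return math.sqrt(max(0.0, 1.0 - ratio))


if __name__ == "__main__":
    # Illustration with r_oa = .44, r_ob = .43, r_ab = .50:
    # R squared of o on a and b is .2524; omitting b leaves r_oa squared.
    print(round(partial_r_from_multiples(0.2524, 0.44**2), 3))
```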


Procedure

1. Convert all r's to E's by interpolating in Table 1. Enter
the values in the matrix at the top of the Work Sheet.

2. Find the multiple $E_{o.ab}$ as follows: Select the chart (from
Figures II through VIII) for the primary value next below $E_{oa}$.
Find $E_{ob}$ on the secondary scale. Move horizontally,
interpolating between the curves for $E_{ab}$. Read
downward to the scale for Added E. Plot this value of Added
E on the interpolation chart (Figure I) opposite the chart
primary. Select the chart for the primary value next above
$E_{oa}$ and repeat the same process.

3. Draw a line between the two plotted points on the interpolation
chart. Find the point where the primary $E_{oa}$ intersects
this line and read off the corresponding value of Added E.
(This is simply a graphic method of linear interpolation.) The
sum of this Added E and the primary $E_{oa}$ gives the multiple
$E_{o.ab}$, which may be converted to multiple $R_{o.ab}$ by using Table 1.
If determining partials, also enter $1 - R^2$ whenever needed for
the partial formulas.

4. Follow the Work Sheet, getting successive multiples in
the same manner, always using the higher value of the first
two columns as the primary. For safety, check each step
before going on to the next, to avoid compounding an error in
the higher stages.

5. Compute the partials by the formulas at the end of the
Work Sheet.
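Steps 1 and 3 of the procedure can be sketched in Python. The conversion table is not reproduced in this section, so the sketch assumes E = 100(1 - √(1 - r²)), a relation inferred from the sample-problem values (for instance, r = .50 with E = 13.40), not stated by the author. The chart readings passed to the interpolation function are hypothetical, and all function names are illustrative.

```python
import math


def r_to_e(r):
    """Convert r to E, assuming E = 100 * (1 - sqrt(1 - r**2))."""
    return 100.0 * (1.0 - math.sqrt(1.0 - r * r))


def e_to_r(e):
    """Inverse of r_to_e under the same assumed relation."""
    return math.sqrt(1.0 - (1.0 - e / 100.0) ** 2)


def interpolate_added_e(primary, lower_chart, added_lower,
                        upper_chart, added_upper):
    """Linear interpolation of Added E between two chart primaries.

    lower_chart and upper_chart are the chart primaries next below and
    next above the actual primary E; added_lower and added_upper are the
    Added E values read from those two charts.
    """
    fraction = (primary - lower_chart) / (upper_chart - lower_chart)
    return added_lower + fraction * (added_upper - added_lower)


if __name__ == "__main__":
    print(round(r_to_e(0.50), 2))
    # Hypothetical readings: 3.0 on the chart for 10 and 2.0 on the chart for 15.
    print(interpolate_added_e(12.0, 10.0, 3.0, 15.0, 2.0))
```

Doing the interpolation arithmetically gives the same result as drawing the line on the interpolation chart, since the graphic method is itself linear interpolation.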


Minimum Intercorrelation to Use

On the charts it will be seen that the value of Added E tends
to increase rapidly as the intercorrelation E approaches zero.
Because of the unreliability of low values of r, two rules of
conservative practice are suggested:

1. The intercorrelation E should never be so small that the
Added E comes out greater than the secondary E from which
it is derived. That is, a variable with an E of 10% by
itself cannot contribute more than an Added E of 10% to the
multiple. (This rule is invariably violated when the intercorrelation
E is taken as a flat zero.)

2. The intercorrelation E should never be less than the
value of E which differs significantly from zero by Fisher's
t-test. (This depends on the number of cases, varying from
3.8 for 50 cases to 0.4 for 500.)


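Sample Problem

(r matrix)
$r_{oa}$ .44    $r_{ab}$ .50    $r_{ac}$ .56    $r_{ad}$ .26
$r_{ob}$ .43    $r_{bc}$ .42    $r_{bd}$ .32
$r_{oc}$ .42    $r_{cd}$ .20
$r_{od}$ .32

Multiple R by computation = .551

(E matrix)
$E_{oa}$ 10.21    $E_{ab}$ 13.40    $E_{ac}$ 17.15    $E_{ad}$ 3.44
$E_{ob}$ 9.72     $E_{bc}$ 9.25     $E_{bd}$ 5.26
$E_{oc}$ 9.25     $E_{cd}$ 2.02
$E_{od}$ 5.26

(Work Sheet)
$E_{oa}$ 10.21*     $E_{ob}$ 9.72    $E_{ab}$ 13.40     Added E 3.25    $E_{o.ab}$ 13.46
$E_{ca}$ 17.15*     $E_{cb}$ 9.25    $E_{ab}$ 13.40     Added E 1.55    $E_{c.ab}$ 18.70
$E_{o.ab}$ 13.46*   $E_{oc}$ 9.25    $E_{c.ab}$ 18.70   Added E 1.48    $E_{o.abc}$ 14.94
$E_{da}$ 3.44       $E_{db}$ 5.26*   $E_{ab}$ 13.40     Added E 0.68    $E_{d.ab}$ 5.94
$E_{d.ab}$ 5.94*    $E_{dc}$ 2.02    $E_{c.ab}$ 18.70   Added E 0.00    $E_{d.abc}$ 5.94
$E_{o.abc}$ 14.94*  $E_{od}$ 5.26    $E_{d.abc}$ 5.94   Added E 1.40    $E_{o.abcd}$ 16.34

* used as primary

$R_{o.abcd}$ = .548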
The chart-derived multiple R of .548 in this instance differs by
only .003 from the multiple R of .551 obtained by computation.
The charts used were drawn on 8½ x 11 cross-section charts, which
permit more accurate interpolation than the charts printed in this
Journal. (Prints of these larger charts will be furnished by the
author on request.)


Figure I—Interpolation Chart.

Figure II—Chart for Primary E of 21.

Figure III—Chart for Primary E of 5.

Figure VI—Chart for Primary E of 20.

Figure VII—Chart for Primary E of 30. (Labels on the chart: Secondary, Added E, Intercorrelation, Primary.)

Figure VIII—Chart for Primary E of 40.


TABLE 1
Comparative Values of r, E, and (1 − R²)

E = 100(1 − √(1 − r²))

  r      E     1−R²       r      E     1−R²       r      E
 .01    0.01   .9999     .36    6.70   .8704     .71   29.58
 .02    0.02   .9996     .37    7.10   .8631     .72   30.60
 .03    0.05   .9991     .38    7.50   .8556     .73   31.66
 .04    0.08   .9984     .39    7.92   .8479     .74   32.74
 .05    0.13   .9975     .40    8.35   .8400     .75   33.86
 .06    0.18   .9964     .41    8.79   .8319     .76   35.01
 .07    0.25   .9951     .42    9.25   .8236     .77   36.20
 .08    0.32   .9936     .43    9.72   .8151     .78   37.42
 .09    0.41   .9919     .44   10.20   .8064     .79   38.69
 .10    0.50   .9900     .45   10.70   .7975     .80   40.00
 .11    0.61   .9879     .46   11.21   .7884     .81   41.36
 .12    0.72   .9856     .47   11.73   .7791     .82   42.76
 .13    0.85   .9831     .48   12.27   .7696     .83   44.22
 .14    0.98   .9804     .49   12.83   .7599     .84   45.74
 .15    1.13   .9775     .50   13.40   .7500     .85   47.32
 .16    1.29   .9744     .51   13.98   .7399     .86   48.97
 .17    1.46   .9711     .52   14.58   .7296     .87   50.69
 .18    1.63   .9676     .53   15.20   .7191     .88   52.50
 .19    1.82   .9639     .54   15.83   .7084     .89   54.40
 .20    2.02   .9600     .55   16.48   .6975     .90   56.41
 .21    2.23   .9559     .56   17.15   .6864     .91   58.54
 .22    2.45   .9516     .57   17.84   .6751     .92   60.81
 .23    2.68   .9471     .58   18.54   .6636     .93   63.24
 .24    2.92   .9424     .59   19.26   .6519     .94   65.88
 .25    3.18   .9375     .60   20.00   .6400     .95   68.78
 .26    3.44   .9324     .61   20.76   .6279     .96   72.00
 .27    3.71   .9271     .62   21.54   .6156     .97   75.69
 .28    4.00   .9216     .63   22.34   .6031     .98   80.10
 .29    4.30   .9159     .64   23.16   .5904     .99   85.89
 .30    4.61   .9100     .65   24.01   .5775
 .31    4.93   .9039     .66   24.87   .5644
 .32    5.26   .8976     .67   25.76   .5511
 .33    5.60   .8911     .68   26.68   .5376
 .34    5.96   .8844     .69   27.62   .5239
 .35    6.33   .8775     .70   28.59   .5100
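
Any row of Table 1 can be regenerated from the formula printed with the table. The sketch below is a convenience for checking or extending the table; the function names are hypothetical and rounding to two and four decimals is assumed from the printed values.

```python
import math


def e_from_r(r):
    """E = 100(1 - sqrt(1 - r^2))."""
    return 100.0 * (1.0 - math.sqrt(1.0 - r * r))


def r_from_e(e):
    """Inverse conversion: recover r (or multiple R) from E."""
    return math.sqrt(1.0 - (1.0 - e / 100.0) ** 2)


def table_row(r):
    """Return (r, E, 1 - r^2) rounded the way the table prints them."""
    return (round(r, 2), round(e_from_r(r), 2), round(1.0 - r * r, 4))


# Regenerate the first two column groups of the table, r = .01 to .70.
rows = [table_row(i / 100.0) for i in range(1, 71)]
```

Because E is monotone in r, `r_from_e` gives a direct way to convert a chart-derived multiple E back to a multiple R without interpolating in the table.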


(Work Sheet column note: Primary & secondary. Use larger as primary.)

WORK SHEET 
[Blank entry form. The upper part provides spaces for the E matrix
(ab, ac, ad, ae; bc, bd, be; cd, ce; de). The body provides one row
per step, with columns headed Intercorrel. E, Added E, Multiple E,
Multiple R, and 1 − R². The layout of the rows is not recoverable
from this copy.]

This work sheet is designed for a 6-variable problem. 
For а 5-variable problem cross out all rows having 
an ‘e’ in the multiple. For a 4-variable problem cross 
out all rows having either ‘d’ or ‘e’ in the multiple. 
Partial        6 variables        5 variables        4 variables

[The partial formulas are not legible in this copy. The surviving
fragments are each of the form 1 − (1 − R²)/(1 − R²), a ratio of two
entries from the 1 − R² column of the Work Sheet.]
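
Since the printed partial formulas did not survive, the following sketch uses the standard identity for a squared partial correlation in terms of two multiple correlations, r² = 1 − (1 − R² full)/(1 − R² reduced), where the reduced multiple omits the variable of interest. This is an assumption about what the Work Sheet formulas compute, consistent with the surviving fragments but not confirmed by them; the function name and the sample values are hypothetical.

```python
import math


def partial_r_squared(one_minus_r2_full, one_minus_r2_reduced):
    """Squared partial r from two (1 - R^2) entries.

    one_minus_r2_full:    1 - R^2 with all predictors included.
    one_minus_r2_reduced: 1 - R^2 with the variable of interest left out.
    """
    if not 0.0 < one_minus_r2_reduced <= 1.0:
        raise ValueError("reduced 1 - R^2 must lie in (0, 1]")
    return 1.0 - one_minus_r2_full / one_minus_r2_reduced


# Hypothetical values for illustration only.
r2_partial = partial_r_squared(0.64, 0.80)
partial_magnitude = math.sqrt(r2_partial)
```

The identity yields only the magnitude of the partial; its sign has to be taken from the direction of the relationship in the data.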


BOOK REVIEW 


Howard K. Morgan. Industrial Training and Testing. New
York: McGraw-Hill Book Company, 1945. $2.50.

This book will be welcomed by persons interested in industrial
personnel work. The topics covered are those in which a personnel
director in industry is concerned: selection, testing, training,
supervision, service ratings, counseling, follow-up, and the costs of
these services. The materials are presented on the whole in a simple
and understandable manner. It is evident that the author is used to
interpreting his subject for the general reading public. Certain
comments concerning the book seem pertinent.


The relationship between the topics covered ... the reader has to
ask himself “for what ...” This relationship would have been made
evident ... analysis of each job before testing, training, ...
received more consideration.

The section on tests, because of the lack of discussion of the
subject with relation to the job description, does not seem to be
adequately covered. The author has reviewed a number of standardized
tests but he does not give the reader criteria for choosing tests to
be used in specific situations. The main criterion in industry is, of
course, the description of the job based on an analysis of the job,
whether by desk audit, questionnaire, ... or a combination of all
three. Also, the ... of standardized tests ... in which ... used. He
has suggested a ...

... again the relationship of the job description ... does not seem
to receive adequate attention. However, in the reviewer's ... the
best portion of the book ... In this section the author ... how the
work of the training department affects the worker ... Here again
certain logical ...



be answered—the ways in which the training program may be
evaluated—but he does not get the answer. He can, and should be able,
to draw such conclusions after reading the chapter on follow-up, but
he does not know this until he comes to that chapter. That either
ten or five per cent may be failed, as the author suggests, if
scientifically based, might be explained by reference to the chapter
on follow-up.

The section on service ratings or work review describes adequately
and simply the evolution of rating scales. However, ratings should
follow the requirements as defined in the job specification. All jobs
do not require the same personality factors, work abilities, and
skills. The worker must be scored on only those factors necessary to
efficiency on the job he is doing. Work review is really a part of
follow-up.

It is, in fact, in the chapter on follow-up that the reader might
have been shown the close relationship between all of the topics
discussed. To some extent the author has done this. The chapter is
entitled “Follow-up—The Key to Training,” which is misleadingly
narrow. Follow-up is also the key to the evaluation of testing,
supervision, counseling, etc.

This book is one of a number in a series which the publisher calls
the “Industrial Organization and Management Series.” One point
of view concerning employee counseling, counseling in industry, is
presented by this author. Should one read another book in the same
series, Employee Counseling by Nathaniel Cantor, he would wonder
that the two personnel techniques are called by the same names.

This review should not be interpreted as being wholly critical.
Personnel work in industry has mushroomed during the war. This
book is another evidence of the need for more careful definition and
clarification of personnel techniques and processes as applied to
industry and it is a justifiable effort to meet that need.

Great strides have been made in the application of personnel
techniques to industry in both World Wars. After World War I
there was some evidence of a trend toward personnel work being
discredited, partly because some techniques were applied as personnel
techniques before they were ready to be released from the laboratory,
and partly because they were not adequately understood or evaluated.
There was too much stress on them as miraculous procedures and too
little attention to the need for a scientifically sound background.
Too often, top management did not see the scientific basis for
applying these techniques in industry. In many cases today personnel
work in industry has not yet proved itself in the eyes of top
management. Personnel workers who are interested in the continuance
of personnel work in industry must take the responsibility for clear
scientific justification if this work is to be furthered. This book
is an effort in the right direction. Frances Oralind Triggs.



THE CONTRIBUTORS 


Clifford R. Adams—Ph.D., Pennsylvania State College, 1940.
Teaching and administration, North Carolina Public Schools, 1921-
1931; Director of Personnel, Collins and Aikman Corporation,
1931-1935. State Director of Personnel, Pennsylvania State
Emergency Relief Administration, 1935-1936. Assistant State
Director, ... Professor of Psychology, Pennsylvania ... Author ...
testing and marriage problems. Member, American Psychological
Association, ...


Dorothy C. Adkins—Ph.D., Ohio State University, 1937. Graduate
Assistant, Ohio State University, ... Examiner, Board of
Examinations, University of Chicago, ...-1940. Assistant Chief, 1940,
and Chief, Research and Test ..., State Technical Advisory Service,
Social Security Board, ... Chief, Social Sciences and Administration,
... Unit, United States Civil Service Commission, ... Author of
articles on test construction and statistical methods applied to test
results. ... Assistant Managing Editor of Psychometrika, 1938-.
Associate Editor of EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT, ...
Joseph ...—... University of Minnesota, 1939. Chief, Personnel
Testing Unit, San Bernardino Air Technical Service ... Employed by
... City Civil Service Commission and by ... Board of Education ...


Edward S. Bordin—Ph.D., Ohio State University, ... Research
Assistant, Ohio State University, 1938-... Coordinator of Student
Personnel Services, University of Minnesota, 1939-1940. Assistant to
the Director of Student Counseling Bureau (then called the University
Testing Bureau), University of Minnesota, 1940-1941. Counselor,
Student Counseling Bureau, University of Minnesota, 1941-1942.
Personnel Technician, Personnel Research Section, AGO, War
Department, 1942-1945. Senior Counselor and Assistant Professor of
Psychology, Student Counseling Bureau, University of Minnesota, 1945.
Acting Director of Student Counseling Bureau, University of
Minnesota, 1945-. Author of articles on statistical and experimental
methodology, research in counseling and test theory and analysis.
Associate Member, American Psychological Association. Member,
Psychometric Society, American Society for Aesthetics.

Mary Edith Bradley—B.A., MacMurray College, 1945. Personnel
Technician, Illinois State Examining Division, United States
Civil Service Commission.


Wilma T. Donahue—Ph.D., University of Michigan, 193... Principal
Psychologist, Bureau of Psychological Services, Instructor in
Psychology and Mental Hygienist in the Student Health Service,
University of Michigan, 1937-1945. Psychologist, University of
Michigan Regents-Alumni Scholarship Program, 1943-. ..., Bureau of
Psychological Services, Institute for Human Adjustment, University
of Michigan, 1945-. Author of professional articles and co-editor of
“The Disabled Veteran” in the Annals of the American Academy of
Political and Social Science, May, 1945. Member, American
Psychological Association (Committee on Standards for Psychological
Service Centers), American College Personnel Association, Michigan
Psychological Association, Sigma Xi.


Daniel D. Feder—Ph.D., University of Iowa. Associate, ...,
University of Iowa, 1934-1938. Assistant Director, Personnel Bureau,
and Assistant Professor of Psychology, University of Illinois,
1938-1942. Executive Officer and Supervisor, Illinois Civil Service
Commission, 1942-. On military leave of absence serving with the
United States Navy, 1942-1946. Officer in charge, ... Materiel Unit
(formerly Training Activity, now part of Research Activity). Officer
in Charge to study German Naval Selection and Training Methods
attached to United States Naval Technical Mission in Europe. Author
of articles on personnel and measurement. Member, American
Educational Research Association, American Psychological
Association, American College Personnel Association, Civil Service
Assembly. President, American College Personnel Association,
1946-1947.

Edwin A. Fensch—Ph.D., Ohio State University, 1942. Instructor
in German, Ashland College, 1931-1933. Social Science Teacher,
Mansfield, Ohio, Public Schools, 1933-1941. Psychologist, Mansfield
Public Schools, 1942. Director of Research, Mansfield Public
Schools, ... Author of articles in educational journals. Member,
Ohio Association of Applied Psychologists, Ohio Education
Association, Phi Delta Kappa, Association of Secondary School
Principals.


William Leroy Jenkins—Ph.D., University of Michigan, 1936.
Instructor and Assistant Professor, Lehigh University, 1935-1943.
Research Associate, University of California Division of War
Research, ... Supervisor, Training Aids, Columbia University Division
of War Research, Submarine Training Section, 1944-1945. Associate
Professor of Psychology, Lehigh University, 1946-. Author of articles
on ... sensitivity. Member, American Psychological Association.


D. Welty Lefever—Ph.D., University of Southern California, ...
Member of the Faculty of the University of Southern California
since 1926. At present, Professor of Education. Consultant to the
Personnel Testing Unit, San Bernardino Air Technical Service
Command. Author of Predictive Values of Certain Groupings of the
Test Elements of the Thorndike Intelligence Examinations. Co-author
of Principles and Techniques of Guidance. Member, Phi Kappa Phi,
Phi Delta Kappa.


Robert J. Lewinski—Ph.D., University of Iowa, 1939. Assistant
... Psychology, University of Iowa, 1938-1939. Director and Chief
Psychologist, Child Study Institute, Toledo, Ohio, 1939-1941. ...
Psychology, University of Toledo, 1939-1941. Chairman, ... County
Committee on the ..., 1940-1941. Active duty in the United States
Navy with various commissioned ranks, ... Toledo Branch, The Great
Atlantic and Pacific Tea Company, 1946-. Commander, ... United
States Naval Reserve. Member, American Psychological Association,
Midwestern Psychological Association, Association of Military
Surgeons of the United States, Sigma Xi.


Frances Oralind Triggs—Ph.D., Syracuse University, 1937.
Dean of Women, ... Counselor and Remedial Reading ..., University
of Minnesota ... Consultant to ... of Nursing Education ... Personnel
Bureau ... Personnel Consultant, ... American University ... Summer
teaching, personnel ..., University of Washington. Author of Improve
Your Reading, Improve Your Spelling, Remedial Reading: The Diagnosis
and Correction of Reading Difficulties at the College Level, ...
Personnel Work in Schools of Nursing. Author of articles in technical
... Member, American Academy of Political and Social Science,
American College Personnel Association, American Educational
Research Association, American Psychological Association and other
learned associations.


Maurice E. Troyer—Ph.D., Ohio State University, 1935.
Superintendent, Bureau of Township Schools, Princeton, Illinois,
1925-1929. Assistant Professor of Psychology, Bluffton College,
1930-1932. Instructor in charge of Remedial Program, Ohio State
University, ...-1936. Assistant Professor of Education, Syracuse
University, 1936-1939. Associate Professor, 1939. Associate in
Evaluation, Commission on Teacher Education, American Council on
Education, 1940-1943. Director, Bureau of School Services, Professor
of Education, Syracuse University, 1943. Director, Evaluation Service
Center, Syracuse University, 1945. Member, American Psychological
Association, American Association of Applied Psychology, American
Educational Research Association, American Association for the
Advancement of Science.


(Mrs.) Alice Van Boven—M.A., Claremont College, Claremont,
California, 1934. Statistician, Personnel Testing Unit, San
Bernardino Air Technical Service Command, 1943-.

Karl Peak Zerfoss—Ph.D., Yale University, 1930. Professor of
Psychology and Director of Graduate Placement, George Williams
College, 1930-. Author of articles on guidance. Member, Association
of Midwestern College Psychiatrists and Clinical Psychologists,
Illinois Association for Applied Psychology. Fellow, National
Council on Religion in Higher Education.



THE VALIDITY OF WRITTEN TESTS FOR THE
SELECTION OF ADMINISTRATIVE
PERSONNEL


MILTON M. MANDELL and DOROTHY C. ADKINS¹


United States Civil Service Commission 


I. Introduction 


Perhaps the most neglected and at the same time most crucial
problem in the field of personnel selection is how to choose among
applicants for administrative positions. The problem is critical
because administration of poor quality can seriously impede
production, whereas correct decisions properly timed and executed
effect almost unbelievable savings in time, manpower, and money.
Whether an industrial organization builds up a profit, or whether a
Government agency successfully defines and prosecutes a program is
dependent in large measure upon the quality of its administrative
staff.
Despite the obvious value of discovering effective objective
methods for selecting capable administrators, several factors
seem to have led investigators to avoid this field and to have
reduced the potential effectiveness of the studies that have
been made. In the first place, the boundaries of positions to be
classified as administrative are vague. Hence defining the
characteristics of the positions to be grouped together for
selection purposes, or for comparing the effectiveness of different
selection methods, is at best difficult. In the second place, there
appears at first thought to be little prospect of obtaining
agreement on what constitutes “success” for persons in
administrative positions. For this reason the problem of setting up a
reliable criterion against which to appraise the effectiveness of
various tests is especially forbidding. The investigator is apt to
feel, and often with justification, that his test is a more
defensible measure of job performance than any independent
criterion measures he could be likely to obtain. A third deterrent
has been the emphasis on personality factors in relation to
success in administrative work. Recognizing that objective
tests of such factors that would be suitable for use in
competitive situations have not yet been developed, investigators
have tended to avoid trying out tests of other factors for which
appropriate tests have been or could be developed. Finally, the
relatively small number of administrative positions has led test
technicians and psychologists to concentrate their efforts on
occupational fields such as the clerical, where mass recruiting is
more frequently needed and where the likelihood of positive
results has at the same time appeared to be greater.

¹ The writers wish to express their appreciation to Dr. T. L.
Bransford, who gave active support to this study throughout, to
Mr. Samuel S. Board, who participated in the initial planning, to
Dr. Herbert S. Conrad, who assisted in planning and carrying out the
statistical analysis of results, and to Mrs. Jeanne Davis, who was
immediately responsible for the statistical work involved.

² For a summary of the literature and a discussion of some of the
problems in this field, see Mandell, Milton. “Testing for
Administrative and Supervisory Positions.” EDUCATIONAL AND
PSYCHOLOGICAL MEASUREMENT, V (1945), 217-228.


II. Purpose


The United States Civil Service Commission has recently
completed its initial study of the validity of written tests for
the selection of administrative personnel. It faced this task
with considerable skepticism both because of the dearth of
existing tests and the difficulty of devising tests that appeared
promising for this purpose, and because of the problems in
attempting to obtain reliable criterion measures for a sufficient
number of personnel to make the study worth while from a
statistical point of view. It nevertheless recognized the
importance of even negative results in such an unexplored field.

The study was confined to an effort to discover valid written
tests for selecting personnel for administrative positions, with
emphasis on program planning, formulation of broad policies,
and large-scale coordination of activities, as distinguished from
supervisory positions, where the emphasis is primarily on relations
with subordinates. The administrative positions studied
included both staff and line positions. The purpose of the study
was to identify tests that would predict competence in
administrative work regardless of any specialty or technical
field involved. It was recognized, however, that the effectiveness
of personnel selection for a particular specialty or field probably
could be increased substantially by including in the battery of
tests for selection for that particular field not only the tests
which successfully predict aspects of performance common to all
administrative positions, but also some tests specially designed to
sample knowledge and ability in the special area concerned.


III. Criteria for Choosing Tests


Not all of the tests that might be profitable for selection
purposes were included in the study. Those that were
tried out satisfied three conditions:

1. As just indicated, the tests were chosen partly because
they presumably test elements common to all administrative
positions rather than elements in special fields appropriate to
only particular groups of positions. A very practical reason
for this restriction is that the available sample of subjects in
each specialized administrative group was too small to yield
dependable conclusions as to the value of special tests for each.
No attempt was made in this study to include specially designed
subtests for each specialized group.

2. The tests were judged to be not at all or only slightly
subject to “fudging,” which would largely negate their value
for inclusion in a competitive testing program. This requirement
that the tests should be of a type such that the subject
could not “fake” his responses and thereby get an unjustifiably
or atypically high score automatically excluded the bulk of
personality inventories.³ Since the major work of the Civil
Service Commission is selecting personnel from among competitors,
the criterion of usefulness for competitive purposes
was an important one in determining the tests to be tried out.

3. The tests had at least an element of “face validity” or an
appearance of measuring factors seemingly related to the job.
Although the tests selected for tryout differ in the degree to

³ This is not to deny the value of many such tests in a
noncompetitive or clinical setting.



which they seem to bear directly on job duties and responsibilities,
none was included that would seem to be manifestly
unrelated to the positions. The “face validity” of the tests was
taken into account in selecting the tests because of the great
importance of public acceptability of tests in civil service
examining. Possibly the inclusion of perceptual tests (such as the
Gottschaldt Figures, for which Thurstone⁴ obtained some
promising results) and other nonverbal tests would have
contributed appreciably to the prediction of job success of the
subjects in this study. Such tests might have yielded a multiple
correlation coefficient significantly greater than the one
obtained from the best selection from among the verbal types of
tests included in the study because of their low intercorrelation
with verbal tests. Even if their statistical validity were
established without question, however, the advisability of using
them for civil service testing in the near future might be
questionable.

IV. The Tests Selected

Five tests considered to meet these criteria satisfactorily
were given to all subjects and two additional tests that meet
the criteria were given to part of the subjects. The tests were
as follows:

1. American Council on Education Psychological Examination
(linguistic ability). This portion of the A.C.E. test consists
of three subtests, Completion, Same-Opposite, and Verbal
Analogies, and contains a total of 120 items. It attempts to
measure both vocabulary and verbal reasoning ability. It was
included because previous studies, by Thurstone and others,
had indicated that this type of test is of value in selecting
administrative personnel. Examples:


Completion—Think of the word that fits the definition. Then
mark the first letter of that word on the answer sheet.
One who departs from a country to settle permanently elsewhere.
B  C  D  E  F

Same-Opposite—Select the word at the right which means the
same as or the opposite of the first word in the row.


⁴ Thurstone, L. L. A Factorial Study of Perception. Chicago:
University of Chicago Press, 1944, pp. 133-144.

Mead 1) angry 2) deliberate 3) tolerant 4) calm 
Verbal Analogies—In each row of words, the first two words
form a pair. The third word can be combined with another
word to form a similar pair. Select the word which completes
the second pair.
rehearsal-performance pending 1) temporary 2) accomplished
3) experimental 4) timely


2. Current Events. This test, constructed by the Civil
Service Commission, consists of 40 multiple-choice items
designed to test factual knowledge of current social, governmental,
and economic conditions. Its inclusion was based on promising
results obtained from its use by the Forest Service of the United
States Department of Agriculture. It was thought, too, that
this test might tap some of the same factors as are tested by
the Social Scale of the Allport-Vernon Scale of Values, which
Thurstone had found to discriminate between good and poor
Federal administrators.⁵ Example:

To which of the following types of legislation has the phrase
“cradle to the grave” recently been applied?
A) military service
B) social security
C) public health
D) civil service
E) education

3. Interpretation of Data Test of the Progressive Education
Association. Twenty-five items that had proved discriminating
in a previous study made by the Civil Service Commission
of the validity of tests for the selection of administrative
interns were included in this study. The Commission's
tryout of tests for supervisory personnel had also indicated that
this test might prove useful.

This test consists of groups of statements about statistical
charts or tables. The degree of truth or falsity of each
statement is to be indicated by use of the following code:

These data alone
A) are sufficient to make the statement true
B) are sufficient to indicate that the statement is probably
true
C) are not sufficient to indicate whether there is any
degree of truth or falsity in the statement
D) are sufficient to indicate that the statement is probably
false
E) are sufficient to make the statement false

⁵ Thurstone, L. L., op. cit.


4. Thurstone's Estimating Test. This test, consisting of 20
questions, attempts to measure the ability to make reasonably
close estimates of factual data on the basis of related, but not
direct, information. It was included because Thurstone had
found it valuable for distinguishing the better from the poorer
administrators. His instructions for scoring are to score the
test on the basis of the percentage right of those questions
attempted rather than the total number of questions correct.
Since practically all of the subjects in the present study
answered all of the questions, the score used is simply the total
number of questions correct. This fact may have some bearing
on the results obtained. Example:

Estimate the population per square mile in the United States
in 1940.
A) 15
B) 45
C) 215
D) 1035



5. Administrative Judgment Test. This test, prepared
mainly by the Civil Service Commission,⁷ consists of ...
multiple-choice items which attempt to measure understanding
of administrative situations. Job analysis indicates that the
ability to analyze administrative problems relating to line-staff
relationships, central office-field office relationships,
coordination, and the like, is an important component of
administrative positions. All of the items were reviewed by
consultants in high-level administrative positions both in
Government and in industry. Obtaining the reactions of the latter
group was considered especially important in order that the
suitability of the test for open-competitive selection could better
be evaluated. The split-half reliability coefficient for this test
of .94 indicates that it gives satisfactorily consistent results.
This test was scored on the basis of the percentage correct of the
number of questions attempted rather than the total number of
correct answers, since not all subjects finished the test. All,
however, who attempted fewer than 50 questions were eliminated from
the study. Example:

⁶ Thurstone, L. L., op. cit.

⁷ Fifteen items included in this test were made available by the
Social Security Board for experimental purposes.



Which one of the following administrative situations or problems
will most probably occur when direct relations are permitted
between a staff specialist employed by the national
office of an organization and the operating officials employed
in the field offices?
A) decrease in the feeling of responsibility of national
office specialists for the operations of state programs in
their specialties
B) inadequate technical supervision of field office operations
C) inadequate knowledge in the national office of the
competence and qualifications of field office personnel
D) difficulty in keeping the relations on an advisory basis
E) subordination of professional considerations to general
administrative responsibilities


6. Agency Organization and Personnel Test. This test,
prepared by the operating agencies concerned in the study and the
Civil Service Commission, consisted for each agency of 15
multiple-choice questions on factual knowledge of the functions,
[illegible] in which the subjects were employed. The possible
value of this type of test was indicated by a study of Uhrbrock
and Richardson,⁸ which demonstrated the validity of a similar
test for supervisory selection. Unfortunately, the time available
for testing permitted the administration of this test to only a
part of the subjects in our study. Example:
The Federal Home Loan Bank System is under the

A) Federal Housing Administration

B) Federal Home Loan Bank Administration

C) Federal Public Housing Authority

D) Defense Homes Corporation

E) Home Owners' Loan Corporation

8 Uhrbrock, R. S. and Richardson, M. W. "Item Analysis: The Basis
for Constructing a Test for Forecasting Supervisory Ability."
Personnel Journal, XII (1933), 141-154.

7. Civil Service Commission Revision of the Allport-Vernon
Scale of Values. In this revision of the Scale of Values,
questions of political or religious significance were deleted because
of their unsuitability for civil service testing. Although the 
questions used are similar to the remainder of those in the 
Allport-Vernon Scale, an attempt was made to achieve greater 
“face validity” and also to sharpen the definition of the Social 
scale. The revised Scale of Values yields four scores, Theoret- 
ical, Economic, Aesthetic, and Social. Thurstone's results in- 
dicated that the Theoretical, Economic, and Social scales dis- 
criminated between the better and the poorer administrators.⁹
Since the discrimination of the Economic scale was negative,
however, it is doubtful that it could be used in a civil service 
setting. Again, not all of the subjects were able to take this 
test. 

Four of the tests included in this study, Current Events,
Administrative Judgment, Agency Organization and Personnel,
and the Civil Service Commission revision of the Allport-Vernon
Scale of Values, were constructed specifically for possible use
in the Federal Government. It seems probable, however, that
similar tests designed for the selection of administrative
personnel in other situations where comparable standards of
performance apply should yield substantially similar results.


V. The Subjects 


The subjects for this study were employees of two Federal
agencies: the Office of the Administrator, National Housing
Agency, and the Federal Public Housing Authority.¹⁰ Results
from the two agencies were combined since the samples seemed
too small to warrant separate treatment. In order to facilitate
an interpretation of the results, however, the total sample from
the two agencies was divided into three groups on the basis of
the types of positions currently held by the employees. The
data were analyzed separately for each of the groups, which
may be identified as (1) Top-Management, (2) Staff, and (3)
Technical.

9 Thurstone, L. L., op. cit.
10 The Civil Service Commission is greatly indebted to Lyman Moore,
Richard Niehoff, Felix Nigro, Dorothy Boyce, Charles Stern, and Dale
Noble of these agencies for their cooperation in providing the subjects
and obtaining the criterion ratings that made this study possible.

1. The Top-Management Group consisted of employees
receiving salaries from $6,200 to $10,000 and occupying positions
[illegible] for directing major segments of [illegible]. They had
broad policy-making, planning, and [illegible] responsibilities.
In terms of total job [illegible] responsibilities would not be
considered [illegible] their administrative duties. The number of
employees in this group for whom complete test and criterion
data were available was 20.

2. The Staff Group consisted of employees who had salaries
[illegible] $7,500 and who were engaged in the field of
[illegible], budgetary analysis and procedures, or administrative
analysis and procedures. Although they are [illegible] rather
than line-operating officials, their work is generally recognized
as falling within the administrative field. There were 63
employees in this group for whom complete data were available.


3. The Technical Group was composed of employees engaged
in such professional fields as statistics, architecture, law,
and engineering. These employees were not engaged in
administrative work at the time of this study. The purpose of
including them was to determine which, if any, of the previously
described tests might help in the selection of administrators
from among persons currently occupying technical positions.¹¹
For this reason, the criterion for the Technical Group was
predicted performance in administrative work rather than
performance in the types of work in which the employees were
actually engaged. In contrast, the criteria for the other two
groups of employees were based on performance in present
positions. Although this aspect of the criterion for the
Technical Group renders interpretation of the results more
difficult than for the other two groups, it seems to be justified
in view of the purpose of the study. The Technical Group
contained 90 employees for whom both test and criterion data
were complete.

11 It should be specifically noted that the reason for including them
was not to determine which of the tests would [illegible] employees
for technical positions. Had that been the purpose, a different battery
of tests chosen with that particular [illegible] in view would have been
[illegible] of the professional sub[illegible] within the Technical Group.




VI. Criteria of Job Performance 


Three types of criteria were used in this study, although not 
every type was applied in the case of each of the three groups 
of subjects, for reasons that will be explained in Section VII. 
The three criteria used were (1) graphic ratings of job per- 
formance, (2) paired-comparison ratings of job performance, 
and (3) salary, with age held constant by the partial correla- 
tion technique. 

1. Graphic Ratings. The instructions used in obtaining
both the graphic ratings and the paired comparison ratings are
given in the Appendix, together with the form for the graphic
ratings.¹² The form provided for ratings on six elements and an
over-all evaluation. The ratings were made on 5-point scales
labelled 1, 2, 3, 4, and 5, with point 3 being defined as
"satisfactory" performance. Only the over-all evaluation was used
as the criterion. It was thought, however, that provision for the
ratings on the separate elements would tend to yield greater
comparability and hence reliability for the over-all ratings. All
subjects with fewer than two ratings on this scale were
eliminated from the results based on this criterion, with a view to
increasing the criterion reliability.

2. Paired Comparison Ratings. For the paired-comparison
ratings, only subjects in the same group, Top-Management,
Staff, or Technical, were compared with each other; in other
words, a Staff employee was not paired off with a Top-Management
or a Technical employee. All cases were eliminated from
the study for whom there were fewer than eight comparisons
with other employees available. This minimum number of
eight comparisons for each employee retained may have been
composed of comparisons made by a single rater against eight
other employees or of comparisons made by more than one rater
against fewer than eight employees. Cases with fewer than
eight comparisons were excluded from results based on this
criterion in an effort to insure at least moderately satisfactory
criterion reliability.

12 These instructions will also serve to illustrate some of the
precautions used to preserve the morale of the group of employees
being tested.

The paired-comparison ratings for each employee were
converted to the score used by computing the percentage of
comparisons in which he was preferred, out of the total number
of comparisons made for that employee.
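The conversion of paired comparisons into a percentage score can be sketched in code. The following is only an illustration, assuming that each comparison is recorded as a pair of code numbers with the preferred employee listed first. The function name, the data layout, and the choice of Python are assumptions made here and are not part of the original study; the minimum of eight comparisons follows the rule stated in the text.

```python
from collections import defaultdict


def preference_scores(comparisons, min_comparisons=8):
    """Percentage of comparisons in which each employee was preferred.

    comparisons: iterable of (preferred_code, other_code) pairs
                 (hypothetical layout, assumed for this sketch).
    Employees with fewer than min_comparisons comparisons are dropped,
    mirroring the exclusion rule described in the text.
    """
    wins = defaultdict(int)
    total = defaultdict(int)
    for preferred, other in comparisons:
        wins[preferred] += 1
        total[preferred] += 1
        total[other] += 1
    return {
        code: 100.0 * wins[code] / n
        for code, n in total.items()
        if n >= min_comparisons
    }


# Hypothetical data: employee "A" is compared with eight colleagues
# and is preferred in six of the eight comparisons.
example = [("A", "B%d" % i) for i in range(6)] + [("C6", "A"), ("C7", "A")]
print(preference_scores(example))
```

In this hypothetical example only employee "A" has the required eight comparisons, so only "A" receives a score.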
3. Salary, with Age Constant. It seems reasonable [illegible]
is related to performance in administrative positions [illegible]
correlate positively with position grade in the [illegible]
and hence with salary, other things being equal. In view of the
appreciable correlation between age and salary [illegible] for
which grade differences were sufficiently pronounced to warrant
the use of grade or salary as a criterion, it was considered
advisable to make some adjustment for age. This correction was
effected by partialling out age.¹³

Since the validity of a test for predicting job performance
depends not only on the intrinsic relationship between the
test and the criterion but also on the reliability of the test and
of the criterion, the particular conditions of this study that led
to careful ratings should be mentioned. Enthusiastic support
was given by a high official in each of the two agencies from
which the subjects were obtained. These officials wrote a
personal letter to each of the subjects and raters requesting
their cooperation. Perhaps even more noteworthy, they
participated as subjects and raters. The results of the study
probably were affected significantly by this type of support.
Moreover, the letter transmitting the rating forms to the raters
emphasized the importance of conscientious ratings and
indicated that no rating was preferable to a rating not
reflecting the best judgment of the rater.

[Illegible] in a supervisory relation [illegible]. It was
considered desirable to obtain as many ratings for each subject
as possible so long as the quality of the ratings was not unduly
reduced. In view of this consideration, some [illegible] staff
positions also rated certain [illegible] felt they had observed
sufficiently, even though the subjects were not their
subordinates or superiors. Although no direct evidence is
available on this point, the assumption that this procedure
increased the reliability of the ratings appears reasonable.

13 This criterion is somewhat similar to that used by Thurstone
[illegible] into four age groups [illegible] his age group and in
[illegible].


VII. Results 


It would have been desirable, of course, to lengthen some of 
the tests considerably for this tryout. In view of the limited 
testing time, however, it was not possible to make every test 
long enough to yield sufficient reliability for individual predic- 
tion. The preferable course in this initial study seemed to be 
to try out several types of tests at the expense of low reliability 
for some of them. Any short unreliable test that gives promising
results can be improved by lengthening it by the addition
of comparable materials.
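The gain to be expected from lengthening a test with comparable material is commonly estimated with the Spearman-Brown formula, r_n = n r / (1 + (n - 1) r), where r is the reliability of the present test and n is the factor by which it is lengthened. The source does not say which formula the authors had in mind, so the sketch below is an assumption offered only for illustration; the function names are hypothetical, and the starting value of .42 is the reliability Table 1 appears to report for the Estimating test in the Staff Group.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


def length_factor_needed(reliability, target):
    """Lengthening factor needed to reach a target reliability."""
    return target * (1 - reliability) / (reliability * (1 - target))


r_short = 0.42                      # assumed starting reliability
print(spearman_brown(r_short, 2))   # predicted reliability if doubled
print(length_factor_needed(r_short, 0.80))
```

Under these assumptions a test with a reliability of .42 would have to be made between five and six times as long to reach .80, which illustrates why short tests were accepted for a tryout but not for individual prediction.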

Reliability coefficients estimated by the Kuder-Richardson
formula (21)¹⁴ for the three groups of subjects are given in
Table 1 below.


TABLE 1
Estimated Reliability Coefficients (Kuder-Richardson Formula 21)

                                     Top-Management      Staff        Technical
                                       r_tt    N       r_tt    N     r_tt    N
A.C.E. (Linguistic) ...............     ?      ?        ?      ?      ?      ?
Current Events ....................    .79    20       .82    63     .62    90
Interpretation of Data ............     ?     20       .74    63     .71    90
Estimating ........................    .28    20       .42    63     .33    90
Agency Organization and Personnel .    .62    14       .64    35     .67    52
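Kuder-Richardson formula 21 estimates reliability from the number of items k, the mean score M, and the standard deviation s alone: r = [k / (k - 1)] [1 - M (k - M) / (k s^2)]. The sketch below shows the arithmetic. The numbers of items in the tests of this study are not given in this section, so the values used here (40 items, mean 21.0, standard deviation 7.1) are purely hypothetical, and the function name is an assumption made for illustration.

```python
def kr21(n_items, mean, sd):
    """Kuder-Richardson formula 21 reliability estimate.

    n_items: number of items in the test (each scored 0 or 1)
    mean:    mean raw score
    sd:      standard deviation of raw scores
    """
    variance = sd ** 2
    return (n_items / (n_items - 1)) * (
        1 - mean * (n_items - mean) / (n_items * variance)
    )


# Hypothetical test: 40 items, mean 21.0, standard deviation 7.1.
print(round(kr21(40, 21.0, 7.1), 3))
```

Because the formula assumes items of equal difficulty, it tends to give a conservative estimate when item difficulties vary.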


14 Kuder, G. F. and Richardson, M. W. "The Theory of the Estimation of
Test Reliability." Psychometrika, II (1937), 151-160.

The reliability coefficient for the Administrative Judgment
Test, estimated by the split-half method for a total group of
258 cases on which scores were available, was .94. No attempt
was made to estimate the reliability of the Civil Service
Commission revision of the Allport-Vernon Scale of Values, which
was taken by only 22 subjects in the Technical Group.

Although the correlation coefficients to be reported have not
been corrected for test unreliability, the reader may wish
to take the foregoing data into account in interpreting the
results. The correlational data for the three groups are as
follows:

1. Top-Management Group. Table 2 presents means,
standard deviations, and Pearson correlation coefficients with
the over-all graphic rating for the six tests for which data were
available for the Top-Management Group. These coefficients
are based on the original data. They are not corrected for
attenuation in either the tests or the criterion, and they are
based on the original test content including all of the items that
were administered to the subjects.

Only the graphic rating criterion was used for this group.
The paired comparison technique could not be applied because
the number of pairs, or comparisons per subject, was too small
to warrant any expectation of criterion reliability, owing to the
small number of subjects in this group. Neither was use of
salary, with age constant, as a criterion for this group considered
feasible, because all of the subjects fell in the three highest
classification grades.


TABLE 2
Test Data for the Top-Management Group, with Over-all
Graphic Ratings as the Criterion

Test                                    N     Mean*    Sigma   Validity r
A.C.E. (Linguistic) ................    ?       ?        ?         ?
Current Events .....................    ?       ?        ?         ?
Interpretation of Data .............    ?       ?        ?         ?
Estimating .........................   20       ?        ?        .10
Administrative Judgment ............   20     59.6?      ?         ?
Agency Organization and Personnel ..   14     12.79     2.11      .66

* The means and standard deviations are in terms of raw scores except for
the Administrative Judgment Test, for which they are based on the percentage
right of the number attempted. The mean criterion score was 3.78 and the
standard deviation 0.76.
Five of the six tests yield validity coefficients that are
surprisingly high, in view of the very small sample and the
relatively low reliability of some of the tests. The magnitude of
the validity coefficients, which is about as high as or perhaps
higher than that generally obtained for written tests for any
occupational group, indicates that the tests are probably
measuring important factors in job success in top management
positions.




In view of the small sample for the Top-Management 
Group, no test intercorrelation coefficients and no multiple 
correlation coefficients were computed. No attempt was made 
to estimate the reliability of the criterion, although the fact that 
the validity coefficients are as high as they are is itself evidence 
of at least moderate criterion reliability.

2. Staff Group. Table 3 presents means, standard deviations,
and Pearson correlation coefficients with the criteria for the five
tests for which complete data were available for the Staff
Group. The criteria used for the Staff Group were (a) salary,
with age held constant by the partial correlation technique, and
(b) the average of the paired comparison rating and the
over-all graphic rating. Correlation coefficients of the tests with
salary and with age are also reported in Table 3.

For the first criterion, the data at hand were position grades
rather than salary. Position grade is probably preferable to
actual salary as the basis for this criterion for the reason that
within-grade salary differences depend more upon length of
service at the grade than upon competence. The criterion is
referred to as salary rather than grade to provide ease of
interpretation.
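The partial correlation technique used to hold age constant can be illustrated with the standard first-order formula, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The sketch below is an illustration only; the function name is hypothetical. The example values are those that Table 3 appears to report for the Administrative Judgment Test in the Staff Group (.56 with salary, -.05 with age) together with the correlation of .46 between salary and age.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# x = test score, y = salary (grade), z = age
r_test_salary = 0.56
r_test_age = -0.05
r_salary_age = 0.46
print(round(partial_r(r_test_salary, r_test_age, r_salary_age), 2))
```

Because the test is almost uncorrelated with age while salary is not, removing age raises the coefficient rather than lowering it.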

The second criterion was obtained by combining the two
types of ratings in order to increase the criterion reliability.
The correlation between the two ratings was .65. Because
different persons rated the various subjects, there was no
satisfactory way to estimate the reliability of the separate ratings
by each of the two methods. Lacking a precise estimate of the
reliability of each, the best solution seemed to be to combine
the two at equal weights into a single criterion score.

As in Table 2, the correlation coefficients reported in Table
3 are not corrected for attenuation and are based on the total
test content as administered to the subjects.

TABLE 3
Test Data for Staff Group, with Two Criteria. N=63.

                                  1      2      3      4      5      6      7      8      9
1. A.C.E. (Linguistic) ......    ..    .64    .6?    .36    .69    .38   -.0?    .43    .30
2. Current Events ...........   .64     ..    .55    .3?    .69    .65    .15    .6?    .26
3. Interpretation of Data ...   .6?    .55     ..    .?7    .56    .42    .00    .48    .41
4. Estimating ...............   .36    .3?    .?7     ..    .47    .30   -.04    .36    .29
5. Administrative Judgment ..   .69    .69    .56    .47     ..    .56   -.05    .6?    .49
6. Salary (CAF Grade) .......   .38    .65    .42    .30    .56     ..    .46     ?      ?
7. Age ......................  -.0?    .15    .00   -.04   -.05    .46     ..     ?     .06
8. Salary, with Age Constant    .43    .6?    .48    .36    .6?     ?      ?     ..    .3?
9. Combined Rating ..........   .30    .26    .41    .29    .49     ?     .06    .3?    ..
Mean ........................ 84.62  20.98  12.17   6.40  57.24  10.17* 34.13    ..   72.62
Standard Deviation .......... 18.17   7.11   4.68   2.69  11.07   2.62*  7.54    ..   11.22

* These are the mean and standard deviation in terms of position grade in the
CAF (Clerical-Administrative-Fiscal) service. The entrance salary corresponding to
the CAF-10 grade at the time of the study was $3970.

Table 3 indicates that the tests have correlation coefficients
with the criteria that on the whole are probably significantly
greater than zero.¹⁵ For the test lengths as used in this study,
the Administrative Judgment Test is best for predicting job
performance and equally as good as the Current Events Test
for predicting salary, with age held constant. The Interpretation
of Data Test is second best for predicting job performance
and third best for predicting salary, with age constant.

15 [Illegible] a correlation coefficient of .25 differs significantly
from zero, and 997 chances in 1000 that a correlation coefficient of
[illegible] differs significantly from zero.
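The size of coefficient needed to differ from zero with a sample of 63 can be checked approximately. The sketch below assumes the large-sample standard error of a zero correlation, 1 / sqrt(N - 1), and normal-curve multipliers; the source does not state which method the authors used, so both the formula and the function name are assumptions made for illustration.

```python
import math


def critical_r(n_cases, z_multiplier):
    """Smallest |r| exceeding z_multiplier standard errors from zero,
    using the approximation SE = 1 / sqrt(N - 1)."""
    return z_multiplier / math.sqrt(n_cases - 1)


n_staff = 63
print(round(critical_r(n_staff, 1.96), 2))  # about 95 chances in 100
print(round(critical_r(n_staff, 3.00), 2))  # about 997 chances in 1000
```

Under this assumption the first value agrees with the coefficient of .25 named in the footnote for the Staff Group.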

For 35 cases for which data were available, the Agency
Organization and Personnel Test correlated .35 with the combined
ratings of job success.

The multiple correlation coefficient of the five tests in Table 3
with the combined ratings of job success was .55, as compared
with .49 for the Administrative Judgment Test alone. The
multiple correlation of the five tests with salary was .68, as
compared with .65 for the Current Events Test. It is not
ap[illegible] of the tests with the [illegible] criterion, which
itself invo[illegible]. In view of the shrinkage which occurs in a
multiple correlation when an experiment is repeated on a
different sample, the investigators would not regard either of
these multiple correlations as substantially higher than the
correlations for the best single tests with the combined ratings
and with salary.
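The point about shrinkage can be made concrete with two short calculations. The first function gives the multiple correlation for two predictors from their correlations with the criterion and with each other; the example uses the values Table 3 appears to report for the Administrative Judgment Test (.49) and the Interpretation of Data Test (.41) with the combined rating, and their intercorrelation of .56. The second applies the familiar adjusted R-squared correction, 1 - (1 - R^2)(N - 1)/(N - k - 1). That correction is an assumption introduced here for illustration and is not necessarily the estimate of shrinkage the authors had in mind; the function names are hypothetical.

```python
import math


def multiple_r_two_predictors(r1c, r2c, r12):
    """Multiple R of a criterion on two predictors.

    r1c, r2c: correlations of each predictor with the criterion
    r12:      correlation between the two predictors
    """
    r_squared = (r1c ** 2 + r2c ** 2 - 2 * r1c * r2c * r12) / (1 - r12 ** 2)
    return math.sqrt(r_squared)


def shrunken_r(multiple_r, n_cases, n_predictors):
    """Multiple R after the adjusted R-squared correction."""
    adjusted = 1 - (1 - multiple_r ** 2) * (n_cases - 1) / (
        n_cases - n_predictors - 1
    )
    return math.sqrt(max(adjusted, 0.0))


print(round(multiple_r_two_predictors(0.49, 0.41, 0.56), 3))
print(round(shrunken_r(0.55, 63, 5), 3))  # five tests, combined ratings
print(round(shrunken_r(0.68, 63, 5), 3))  # five tests, salary
```

On these assumptions the multiple correlations of .55 and .68 shrink to roughly the level of the best single tests, which is consistent with the investigators' caution.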




It should be noted that the correlations with the combined
ratings are based on evaluations made by 55 different raters, 
each of whom rated one or more of the subjects. Although 
the writers know of no statistical theory or technique for sub- 
stantiating the hypothesis, it is suggested that 55 raters who 
furnish the criterion data for 63 subjects may provide a more 
rigorous test of the validity of a measuring instrument than à 
small number of raters who may give biased ratings to many 
more candidates. 

The correlation of .56 between the Administrative Judg- 
ment Test and salary (which, for practical purposes, is equiva- 
lent to grade level) indicates that higher test scores may be 
expected as grade level increases. Thus if such a test were 
used for selection purposes for various grade levels, the data 
would argue for setting progressively higher cutting points as 
grade level increases. 


TABLE 4
Performance of Staff Group on the Administrative Judgment Test. N=63.

                                          Combined rating
Test scores                  Lowest              Middle     Highest
                             (unsatisfactory)
Number of subjects              6                  26          31
Above cutting point             1                  17          31
Below cutting point             5                   9           0

Table 4 was constructed on the basis of such progressively
higher cutting points. Certain grades were grouped together
because of the small number of cases per grade. Four passing
points were set in such a way that all of the subjects in the
upper half on performance exceeded the passing point on the
test for their particular grade levels. With such cutting points,
only one of the six subjects who were rated as "unsatisfactory"
in performance exceeded the critical score for his grade. Since
the procedure used takes advantage of certain chance errors
in the data, a similar table for a new sample of subjects
based on the same cutting points might not show up so well.
Although the desirability of obtaining such a table for a new
sample will not be denied, the writers believe that the present
table gives a [illegible] picture of the order of the discrimination
the test may be expected to yield.
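The discrimination shown in Table 4 can be summarized as the proportion of each rating group that exceeded the cutting point for its grade. The sketch below uses the counts as they appear in the table (1, 17, and 31 above the cutting point; 5, 9, and 0 below); the variable and function names are hypothetical and the code is offered only as an illustration.

```python
rating_groups = ["lowest (unsatisfactory)", "middle", "highest"]
above_cut = [1, 17, 31]   # subjects exceeding the cutting point
below_cut = [5, 9, 0]     # subjects falling below it


def proportion_passing(above_counts, below_counts):
    """Proportion of each rating group that exceeded the cutting point."""
    return [a / (a + b) for a, b in zip(above_counts, below_counts)]


for group, rate in zip(rating_groups, proportion_passing(above_cut, below_cut)):
    print(f"{group}: {rate:.2f}")
```

As the text cautions, these proportions come from cutting points fitted to the same sample and would probably be less favorable in a new sample.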
3. Technical Group. Tables 5 and 6 present means, standard
deviations, and Pearson correlation coefficients for the
Technical Group for the five tests indicated with the over-all
graphic ratings and paired comparison ratings, respectively, as
criteria. Data are reported separately for the two criteria
because the number of cases would have been reduced
considerably if only those subjects had been retained for whom
both types of ratings were available. The extent to which the
coefficients are attenuated as a result of the raters'
consideration of present performance in technical positions in
making their ratings is not known. The ratings intended, however,
were designed to predict performance in an administrative
position in which technical knowledge and judgment would be less
than half of the total job content. Since any prediction of
future performance is based, to a great extent, on present
behavior, it can be expected that the validity coefficients
reported are smaller than the situation justifies.

Table 7 shows correlations with the over-all graphic rating
for the Agency Organization and Personnel Test and the Civil
Service Commission revision of the Allport-Vernon Scale of
Values, based on smaller numbers of cases than those on which
Tables 5 and 6 are based. Not all subjects took these two tests.
[Illegible], the correlation of the Agency Organization and
Personnel Test with paired comparison rating for the Technical
Group was .47, based on 43 cases.


TABLE 5
Test Data for Technical Group, with Over-all Graphic Rating
as the Criterion. N=90

                                  1      2      3      4      5      6
1. A.C.E. (Linguistic) ......    ..    .58    .6?    .49    .68    .39
2. Current Events ...........   .58     ..     ?     .?1    .50    .25
3. Interpretation of Data ...   .6?     ?      ..     ?     .53    .3?
4. Estimating ...............   .49    .?1     ?      ..    .23    .07
5. Administrative Judgment ..   .68    .50    .53    .23     ..     ?
6. Graphic Rating ...........   .39    .25    .3?    .07     ?      ..
Mean ........................    ?      ?      ?      ?      ?      ?
Standard Deviation ..........    ?      ?      ?      ?      ?      ?

TABLE 6
Test Data for Technical Group with Paired Comparison Rating
as the Criterion.

                                  1      2      3      4      5      6
1. A.C.E. (Linguistic) ......    ..    .57    .62    .49    .61    .28
2. Current Events ...........   .57     ..    .42    .31    .43    .3?
3. Interpretation of Data ...   .62    .42     ..    .33    .48     ?
4. Estimating ...............   .49    .31    .33     ..    .22    .1?
5. Administrative Judgment ..   .61    .43    .48    .22     ..    .25
6. Paired Comparison Rating .   .28    .3?     ?     .1?    .25     ..
Mean ........................ 75.70  21.95  12.04   6.83  50.13     ?
Standard Deviation ..........    ?    4.68   4.39   2.75  11.05  25.50

Considering the results obtained from both criteria, the tests
that seem to offer the most promising measures for the selection
of administrators from among technicians are: A.C.E.
(Linguistic), Interpretation of Data, and Agency Organization and
Personnel. As mentioned before, the latter test would not be
suitable for use in open-competitive examinations.

The correlations for the revised Scale of Values, while based
on only 22 cases, are interesting. The negative discrimination
of the Economic scale agrees with Thurstone's finding.¹⁶ The
relative order of the positive discrimination of the Theoretical
and Social scales reported here is the reverse of that found by
Thurstone, who found the Social scale to be most discriminating
of all. Perhaps it should be noted that a test of the type of the
Allport-Vernon Scale may be more subject to "fudging" than
is desirable for a test used in a civil service jurisdiction,
although it appears to be less so than many personality and
interest schedules. And, as was mentioned earlier, a civil service
jurisdiction normally does not consider practicable the use of
[illegible] test for personnel selection purposes [illegible] the
correlation is high. Such a consideration probably would
preclude the application of the [illegible] in the case of the
Economic scale.

TABLE 7
Additional Test Validity Data for Technical Group

                                                  Graphic ratings
1. Agency Organization and Personnel ..........     .35 (N=52)
2. C.S.C. Revision of A.-V. Scale of Values
   A. Theoretical ..............................     .42 (N=22)
   B. Economic .................................      ?
   C. Aesthetic ................................      ?
   D. Social ...................................     .17 (N=22)

16 Thurstone, L. L., op. cit.

The correlations for the Technical Group are lower than
those obtained for the other two groups; one may speculate
as to the extent to which the lower correlations were produced
by [illegible] error in the criteria of ratings on both present
and [illegible] performance. Since personnel specialists seem
[illegible] with the present methods for the selection of
administrators from among technicians, however, perhaps even
these low correlations indicate that an improvement in selection
might result from the use of these tests.


VIII. Appendix 


Information for Rating Employees

We wish to ask for your cooperation in preparing these ratings.
The rating sheets have been prepared for those employees you have
indicated you wish to rate. You, your agency, and the Civil Service
Commission have spent much time with the testing program recently
completed, but much of this effort will be wasted unless your ratings
indicate your careful and critical evaluation of these employees.

These ratings will not be used for any purpose except to determine
the relation between test score and ratings. All the ratings will
be made by designating employees by their code numbers. The
person tabulating the ratings and the test scores will not know the
names of the individuals associated with these papers. The entire
statistical process of analyzing the results will be done on the basis
of code numbers.

If for any reason you feel that you cannot rate an individual,
please do not do so. To repeat, this whole study of administrative
[illegible] now depends on your willingness to furnish the best ratings
possible. Thank you for your cooperation.


Specific Instructions 
Rating Method I. This rating method asks you to rate the
employee in comparison with the standards appropriate for his
position. If his position does not give him an opportunity to
demonstrate his ability on any of these factors, then rate him on
what you believe is his potential ability.

A rating of "5" indicates perfect performance, a rating of "3"
indicates satisfactory performance, and a rating of "1" indicates
unsatisfactory performance. Ratings of "2" and "4" indicate
intermediate degrees of performance. Keep in mind the requirements
of the position occupied by the employee.

Rating Method II. This method requires the comparison of
employees by pairs, taking into consideration the requirements of
their jobs. This comparison is a natural one in the sense that a
supervisor usually thinks of the performance of employees in
relation to that of other employees. Your preference should be based
on the over-all job performance rather than on any one part of it.
In comparing persons not now in administrative positions, the
comparison should be on the basis of their potential administrative
ability and not on their performance in technical positions.


Graphic Rating: Method I

I. Place an X in the appropriate place on the line to the right of
each factor. A rating of 1 indicates that the employee does not meet
the standard of his position on that factor; a rating of 3 indicates that
he does meet the requirements on that factor; a rating of 5 indicates
that he is outstanding on that factor. Please guard against the
tendency to rate an outstanding employee as 5 on every factor, since
even outstanding employees are rarely superior in every respect.

A. Ability to plan an administrative program or project.
                                                  1    2    3    4    5
B. Ability to get a program started, to budget, and to coordinate
   the work of his unit with others.
                                                  1    2    3    4    5
C. Extent of technical knowledge.
                                                  1    2    3    4    5
D. Judgment on technical problems.
                                                  1    2    3    4    5
E. Personal relationships with his subordinates.
                                                  1    2    3    4    5
F. Personal relationships with other government officials or
   the public.
                                                  1    2    3    4    5
G. What is your over-all evaluation of this employee's performance?
                                                  1    2    3    4    5

Rater ____________


Paired Comparison Rating: Method II

This is a paired comparison of employees you have indicated you
can rate on their performance. Indicate by underlining one of the
two code numbers in each pair of numbers which employee has
demonstrated over-all superior performance in his job as compared
with the other employee. Since in many cases the two employees are
in different grades, you should take this difference in grades into
account in considering the performance of the employees. In
comparing employees now in technical rather than administrative
positions, you should compare them on the basis of their potential
administrative success, rather than on the basis of their present
performance.

Rater ____________


RATING OF TRAINING AND EXPERIENCE IN
PUBLIC PERSONNEL SELECTION¹


CHARLES I. MOSIER
Social Security Board


The rating of experience, including training,² like the use of
written and oral examinations, is essentially a problem of
prediction. Experience is important, not for its own sake but as a
basis for predicting success on a particular job. This is true
whether the rating is on an all-or-none basis (as it is in the
application of minimum qualifications standards) or is designed
to result in a rank order ranging from those candidates
presumptively most competent to those whose competence is
assumed to be questionable.

1 Reprinted through the courtesy of The Compass, [illegible] (1946),
[illegible]-38, [illegible] for which it was prepared [illegible] of the
American Association of Social Workers. The opinions expressed in this
paper are those of the author and do not necessarily represent the
official views of the Social Security Board.
2 Throughout this discussion the term experience is used to include
both educational experience and job employment.

In establishing minimum qualifications we are, in effect,
saying that applicants whose experience, academic and otherwise,
is less than the prescribed standard are such poor risks
that we can predict they will be unsuccessful, whereas we can
[illegible] predict that those who do possess the requisite
experience will succeed on the job.

When we assign quantitative scores to particular patterns
of experience we are saying that those people with higher scores
are more likely to be successful than those with lower scores.
This quantitative rating of experience presupposes that among
the candidates for employment there are differences in the
patterns of experience presented. Moreover, these differences in
pattern are assumed to be a basis for predicting job success.
Where the qualifications are high and the salary is low, only
persons who barely meet the minimum standards will be
attracted to the job. In this case no purpose is served by rating
the experience; predicted job success would be the same for all 
candidates and the rating would make no contribution to the 
selection process. Similarly, where the job is of such a nature 
that success cannot be predicted from patterns of previous
experience, within the range offered by the candidates, there is
no point in quantitative rating. An example of such a job is
that of messenger.

The qualification, "within the range of potential candidates,"
is important and should not be overlooked. Among all
conceivable persons differences in experience may be a basis for 
predicting job success, but when the range is narrowed to that 
group applying for the examination, these differences may be- 
come so slight as to afford no reliable basis for designating one 
applicant as a better employment risk than another. 
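
The effect of a narrowed range can be shown numerically. The sketch
below is not drawn from any merit-system data: the years of
experience, the success scores, and the 8-to-11-year applicant band
are all invented for illustration. It computes the correlation
between experience and success first over a wide range of persons
and then over the narrow band that might actually apply.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical population: 0 to 19 years of experience, with a made-up
# success score that rises with experience but wobbles by +/- 3 points.
years = list(range(20))
success = [y + (3 if y % 2 == 0 else -3) for y in years]

r_all_persons = pearson_r(years, success)

# Hypothetical applicant group: only persons with 8 to 11 years apply.
applicants = [(y, s) for y, s in zip(years, success) if 8 <= y <= 11]
r_applicants = pearson_r([y for y, _ in applicants],
                         [s for _, s in applicants])

print(round(r_all_persons, 2), round(r_applicants, 2))
```

With these invented figures the relationship is strong across all
persons and nearly vanishes within the applicant band, which is the
situation in which a quantitative rating adds nothing to selection.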


Prediction Presupposes Accurate Facts 


Prediction always proceeds from facts known at the time of
prediction to an estimate of future behavior. It presupposes
the existence of accurate facts, accurately known and
unambiguous in meaning. The nature of the data on which ratings
of experience are based places, therefore, very definite limits
on the accuracy with which the most refined technique can
predict success. The fact that an individual has had four years
of college training can, of course, be accurately known. Whether
its meaning is unambiguous is open to some question. That
meaning will depend in part on which college, on what grades
were made and on what courses were studied. Similarly, the fact
that an individual has had four years of experience in social
work is a fact that can be determined with some degree of
accuracy, although there may be differences in interpretations
as to whether a particular job is or is not in the social work
field. Its meaning, however, insofar as the prediction of job
performance is concerned, depends not so much on the number of
years of experience as on the nature of the duties and still
more on what was learned during those years of experience. Even
when the number of years of experience is being rated, the
evaluation is limited by the way in which the applicant supplies
the information. Many a candidate has received a high score, not
because his experience was good, but because he described it
well; and candidates with good experience have been rejected
because their descriptions of it were poorly stated in relation
to a particular job.


Relationship between Facts and Possible Job Success 


The second necessary condition for prediction is that there be
a relationship between the known facts and future job success.
It is possible to determine an individual's height with as high
a degree of accuracy as desired; there is, moreover, little
question as to its interpretation. However, height is not used
for predicting success in most jobs because there is no reason
to believe that there is any relation whatever between this
physical characteristic and the individual’s satisfactory
performance of his duties.

The determination of the relationship between particular
experience and job success is a technical problem of extreme
difficulty, and one where the limitations of the basic data make
us question the value of too great refinement in technique. A
further factor complicating the problem is the requirement that
not only must experience predict job success, but, to be useful
in selection, it must predict aspects not already measured by
other more reliable measures.

Preoccupation with the number of years of training and of
experience should not blind us to the fact that we are not
interested in training or in work experience as such; our
interest is rather in the knowledges, skills, and abilities
which have been acquired or demonstrated through this training
and experience. We are not interested in the fact that an
individual was in residence on a particular college campus for a
particular number of months. So was the janitor! Rather, we are
concerned with the question: “Has an individual who has pursued
a certain course of study or held a certain job thereby acquired
knowledges and skills which an individual without such training
is much less likely to possess?”




Experience May Afford a Demonstration of Knowledge 


Experience may not only indicate in a general way the
knowledges and skills which have been acquired during the
period of training; it may also afford a demonstration of the
existence of knowledges, skills or abilities, however they have
been acquired. On a statistical basis, one can predict a higher
level of general intelligence among college graduates than
among high-school graduates. This is not to imply that the
increased intelligence is acquired through college experience,
but rather that the selective effect of the four years of college
training has tended to eliminate those unable to demonstrate
at least a minimum level of ability. Work experience may also
assume importance in the prediction of success, not because of
knowledges acquired, but because the satisfactory performance
of a particular job may show that the individual possesses
certain skills or abilities.

Because we do believe there is a relationship between success
in certain types of jobs and the knowledges acquired or
demonstrated in certain types of training and experience, there
is a strong tendency to generalize this to the unwarranted
conclusion that experience is significant in and of itself. Any
careful consideration of the problem of rating experience must
scrupulously avoid this error and concentrate rather on the
abilities acquired or demonstrated by the experience and on the
relationship between those abilities and future job performance.
This latter relationship cannot validly be taken on faith,
although in practice it often is. The shift from experience as
such to the underlying knowledges and abilities reopens the
question: “Is the indirect evidence afforded by a record of
experience the best way of measuring those underlying
knowledges and abilities?”


The Inductive and the Deductive Method


In the problem of prediction two approaches are possible.
The first of these we may consider as the purely inductive
approach. The records of successful and unsuccessful employees
(including among the unsuccessful employees those whose
probability of success was so slight that they never secured
employment) are analyzed statistically to determine which
patterns of experience characterize the successful employee and
differentiate him from the employee who is unsuccessful. The
inductive method, although it has much to recommend it, is
essentially wasteful; many types of relationships among
training, experience and success will be investigated even
though there is little probability that the investigation will
prove fruitful.

A second and preferable approach is the formulation of careful
hypotheses as to the expected relationship between experience
and success and verification of these hypotheses by actual
observation. Since most schemes for the rating of training and
experience have never gone beyond the state of formulating
a priori hypotheses, the necessity of verification cannot be
overemphasized. However careful the a priori judgment may be,
however competent the consultant whose judgment is used to
set tentative values or patterns of training and experience, full
reliance on the rating of training and experience as a valid
means of predicting success can come only after each hypothesis
has been carefully tested against the actual facts of success or
failure on the job.
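
Whether one explores inductively or verifies an hypothesis, the
test against fact comes down to tabulating outcomes by pattern of
experience. The following is a minimal sketch of such a tabulation.
The pattern labels and the seven employee records are hypothetical,
chosen only to show the bookkeeping; a real study would need far
more cases per pattern before the proportions meant anything.

```python
from collections import defaultdict

def success_rate_by_pattern(records):
    """Return {pattern: (number_of_cases, proportion_successful)}.

    `records` is an iterable of (pattern, succeeded) pairs, where
    `succeeded` is True or False.
    """
    counts = defaultdict(lambda: [0, 0])  # pattern -> [cases, successes]
    for pattern, succeeded in records:
        counts[pattern][0] += 1
        if succeeded:
            counts[pattern][1] += 1
    return {p: (n, s / n) for p, (n, s) in counts.items()}

# Hypothetical follow-up records: (pattern of experience, success on job)
records = [
    ("college + related work", True),
    ("college + related work", True),
    ("college + related work", False),
    ("college only", True),
    ("college only", False),
    ("unrelated work only", False),
    ("unrelated work only", False),
]

rates = success_rate_by_pattern(records)
for pattern, (cases, proportion) in rates.items():
    print(f"{pattern}: {cases} cases, {proportion:.2f} successful")
```

An hypothesis such as "related work adds to the value of college
training" is supported only if the corresponding proportions differ
in the expected direction by more than chance would produce.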


Procedures for Evaluation 


Under the second approach, two procedures are available for
evaluating the experience of an individual applicant for a
particular position. The first of these may be characterized
as impressionistic. A reviewer, using certain written standards
as guides, reads over the total pattern of training and
experience and in the light of those standards assigns a
quantitative evaluation based upon his over-all impression of
the value of the training and experience. However carefully the
standards may be formulated, this procedure in the final
analysis rests upon the subjective judgment of the one or two
reviewers. Even with the most competent reviewers, it is highly
unlikely that they will possess the necessary degree of
clairvoyance to make predictions which are significantly better
than guesses.

The alternative procedure involves first, the evaluation of
the various aspects of experience, that is, the kind and amount
of training, the various jobs held, and second, the combination
of these values for the applicant’s particular pattern into a
composite best prediction of success. If this prediction is to
be a best prediction, the weight assigned each type of
experience should result from statistical investigation of the
actual probabilities of success demonstrated by persons offering
that particular qualification. When such statistical weights are
lacking, it is necessary to fall back upon weights assigned as
the result of the consensus of competent judgment. Moreover, any
combination of individual weights to yield a composite
evaluation must, if it is to be effective, take into account the
interrelationships among the various types of experience and
success, as well as the value of each type of education or
experience. No matter how good the consensus of opinion may be,
the weights should be subjected to later verification against
the actual facts. The weights determined by judgments are in the
nature of hypotheses; they should be considered as tentative
and merely as the best guesses which are available in the
absence of information.
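
The preference for statistical weights, with judged weights as a
tentative fallback, can be sketched as follows. The function name,
the item labels, and every weight are hypothetical. The sketch
simply adds the weights, which is the crudest possible
combination: it does not take account of the interrelationships
among types of experience that the text says an effective
composite must reflect.

```python
def composite_rating(items_offered, statistical_weights, judged_weights):
    """Add up the weight of each item of training or experience.

    A statistically derived weight is used where one exists.
    Otherwise the judged weight is used and the item is reported as
    tentative, so that it can be verified later against the facts.
    Raises KeyError if an item has no weight of either kind.
    """
    total = 0.0
    tentative_items = []
    for item in items_offered:
        if item in statistical_weights:
            total += statistical_weights[item]
        else:
            total += judged_weights[item]
            tentative_items.append(item)
    return total, tentative_items

# Hypothetical weights for a visitor examination.
statistical_weights = {"graduate social work training": 0.5}
judged_weights = {
    "graduate social work training": 0.75,
    "two years as visitor": 0.25,
}

rating, tentative = composite_rating(
    ["graduate social work training", "two years as visitor"],
    statistical_weights,
    judged_weights,
)
print(rating, tentative)
```

Reporting which items rested on judgment keeps the hypotheses
visible, so that each can be replaced by a verified weight as
follow-up data accumulate.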


Hypotheses Which Have Proved Valuable 


In the assignment of values to particular patterns of training
and experience there are certain hypotheses which appear
fruitful and have been extensively followed over the past six
years and more by state merit systems and civil service agencies
in consultation with subject-matter experts. One assumption
which deserves mention only because it has occasionally been
used, although it is far too gross for any adequate results, is
that the probability of success increases with the mere
aggregate length of experience (including educational
experience). This procedure completely ignores such questions as
the pertinency of the education and the relatedness of the
experience to the job in question.

Where minimum educational qualifications are set, an hypothesis
implicit in their use is: there is a minimum of education which
is requisite before any amount of experience is of value. This
may be illustrated by the position of statistician. Unless the
individual has had basic instruction in statistical procedures
and statistical theory, it may be argued that no amount of
experience in adding, subtracting, multiplying and dividing
figures as a statistical clerk will produce the skills necessary
in a statistician. This minimum of education which is required
as a base upon which experience is to build may vary from
grammar school for some jobs through graduate education for
others. However, each time a minimum requirement of education is
imposed, it implies that such a degree of education is necessary
if experience is to have value; it implies, moreover, that no
amount of experience without such education can result in the
necessary knowledges, skills, and abilities. As far as the
writer knows, the assumption has never been tested against fact.

The reaction of certain interested groups toward the practice
of setting minimum educational requirements should not
obscure the fact that for most jobs minimum requirements of
experience are also set. The hypothesis involved is
substantially the same as that with respect to education,
namely, that there is a minimum of experience which is necessary
to a reasonable assurance of successful performance and for
which no amount of education may be substituted.

This hypothesis appears most plausible for the higher classes
in any occupational series. The senior clerk, the principal
accountant, the advanced statistician, or the principal case
work supervisor are those where it is reasonable to say that a
minimum amount of experience in an actual job situation is an
essential prerequisite and that the possession of a Ph.D.
without such actual experience would not, in all probability,
lead to adequate performance on the job.
A third hypothesis usually accepted is that between these
limits of minimum education and minimum work experience,
education and experience may each be considered the equivalent
of the other. In speaking of experience and education as
equivalent, we speak, of course, of education and experience
at the same level of pertinence. It is not proposed, for
example, that experience as an accountant is the equivalent of
graduate social work training for a social work position; nor
on the other hand is it proposed that graduate social work
training is the equivalent of experience as an accountant when
an accounting position is in question. It is proposed, however,
as an hypothesis to be followed until it can be verified or
rejected, that the most closely related experience is the
equivalent of the most closely related education. It would be
possible, of course, to give greater weight to the most
pertinent experience than to the most pertinent education. When,
however, the range between the minimum required education and
the minimum required experience is considered, such differential
weighting leads to both technical and administrative
difficulties.
Another hypothesis, sometimes ignored in the establishment
of systems for the rating of training and experience, is that
there is a maximum of experience beyond which no increase in
competence is either acquired or demonstrated. Let us assume
that an individual's experience has all been as closely related
to the job for which he applied as possible. During his first
year of successful employment he will have learned a great deal;
he will have met new situations and have learned the methods of
dealing with them; he will have acquired skills which he did not
formerly possess; and by holding the job over the period of a
year he has demonstrated definite abilities requisite to the job
for which he is applying. In his second year, however, the
number of new problems as compared with the situations
previously met becomes proportionately smaller and this decrease
in the skills acquired continues until at some stage (after 2,
5, 10 or 20 years) further experience neither results in new
knowledges or skill nor provides any additional demonstration of
his ability.
In fact it may be argued, and with some cogency, that there
is a certain point beyond which continued experience indicates
not an increased, but a decreased probability of success. If one
were being literal minded, he would give negative credit for
additional experience beyond this critical point. An example
will serve to demonstrate this point. Let us consider applicants
for a position as principal clerk. The individual who, at the
end of 5 years of work has served 2 years as senior clerk is, it
would seem, a better risk than the individual who has spent 20
years as senior clerk without advancing further. Although the
example has been chosen from the clerical field, the principle
is applicable to technical jobs as well. The difference between
the two fields lies in the number of years of experience
necessary before the point of detrimental return is reached. We
are not proposing that in actual practice applicants receive
negative credit for experience. Such a proposal would be wholly
unacceptable to the general public and would result in public
reaction so unfavorable as to offset any possible benefits that
might be derived. The suggestion is presented merely to
strengthen the plausibility of the hypothesis of an upper limit.
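
One way to give the upper-limit hypothesis concrete form is a
crediting rule in which each added year earns less than the one
before and nothing is earned beyond a ceiling. The sketch below is
an illustration under invented assumptions: the 5-year ceiling,
the credit of 1.0 for the first year, and the halving of credit
each year are hypothetical figures, not values proposed in the
text. In keeping with the text, no negative credit is given.

```python
def experience_credit(years, ceiling_years=5, first_year_credit=1.0,
                      decay=0.5):
    """Credit for whole years of closely related experience.

    Each additional year earns `decay` times the credit of the year
    before it, and years beyond `ceiling_years` earn nothing.
    All default values are hypothetical.
    """
    credited_years = min(years, ceiling_years)
    credit = 0.0
    yearly = first_year_credit
    for _ in range(credited_years):
        credit += yearly
        yearly *= decay
    return credit

for years in (1, 2, 5, 20):
    print(years, experience_credit(years))
```

Under these assumptions the applicant with 20 years receives
exactly the same credit as the applicant with 5, which is the
practical meaning of a maximum of creditable experience.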

In theory, at least, the corresponding hypothesis holds for
education, namely, that there is a maximum of education beyond
which no increase in job performance is likely to result
and that there may be a limit beyond which any increased
education actually indicates reduced probability of success on
a particular type of job. In practice, these limits are not
usually reached in the patterns of education actually offered by
candidates for a particular position. They may, however,
actually be reached. In a much depressed labor market college
graduates may be available as junior key operators. The fact
of a college education would in all probability be detrimental
to successful adjustment in card punching. For most professional
jobs, however, the theoretical maximum of education need concern
us only in the case of those few individuals who collect
degrees very much as an Indian collects scalps. While in some
cases this collection of degrees reveals a love of learning, in
others it is indicative of an unwillingness to face full-time
paid employment. Those who remain in the cloistered halls far
beyond the normal maximum are not always correspondingly good
employment risks. Of the hypotheses thus far presented, however,
that of excessive education is the least useful in the practical
examining situation.
Experience directly pertinent to the duties of the job is more
predictive of success than experience which is less pertinent.
A corollary of this assumption is that some experience is so
wholly unrelated as to have no predictive value whatever.


Assigning Values to Experience 
This easy generalization must be given concrete expression
in the assignment of particular values to particular experience
as related to particular classes of jobs. It is in assigning
these values that the highest degree of competence on the part
of both the personnel technician and the subject-matter expert
is called for. It is far more desirable to combine the judgments
of several competent persons than to place reliance on the
judgment of a single individual, however competent he may be. As
we have indicated, the ideal method is to determine the weights
for each type of job from an actual analysis of the probability
of success or failure among those individuals possessing that
particular level and kind of experience.

When the values are assigned on the basis of judgment, it is
assumed that there would be a marked relationship between
judged pertinence and the probability of success. This
assumption, of course, requires verification.

A number of techniques have been proposed for the systematic
assignment of values to varying types of experience. The
one which seems to have the most to recommend it is one used
by several of the state merit systems serving social security
agencies. In this procedure all of the applications for a
particular position, for example, visitor, are studied and each
type of experience offered is copied from the application on a
separate 5 x 8 card. The resulting deck of cards shows all of
the types of experience actually offered by applicants for the
visitor position. The cards are then sorted into a number of
piles, with the most valuable experience in the highest pile,
the least valuable experience in the lowest. The sorting is done
independently by a number of persons who are presumed to be
competent to judge the relative value of the several types of
experience in predicting success as a welfare visitor. The
average of the pile number in which a card was placed by the
several judges is taken as the value to be assigned that type of
experience. This method has the advantage of giving systematic
consideration to the types of experience actually offered rather
than of those which might be offered but were not. Moreover,
it provides for a systematic determination of the consensus of
a group of judges. Its value depends on the adequacy with
which the types of experience were actually described by the
candidates and on the extent to which the judges were able to
anticipate the probable success or failure of persons presenting
each type of experience. Once the scale has been developed,
with values assigned each type of experience, the addition of
the new types of experience offered in subsequent examination
programs becomes a fairly simple matter. The technique has
the disadvantage of being cumbersome and time-consuming.
However, the methods which consume less time have results
which are correspondingly less valid.
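
The arithmetic of the card-sorting procedure is simple enough to
sketch. In the example below each judge's sorting is recorded as a
mapping from card to pile number, and the scale value of a card is
the average pile number across judges. The three card labels, the
four judges, and the five-pile scale are hypothetical; the text
does not say how many piles or judges the merit systems used.

```python
def scale_values(sortings):
    """Average the pile number given to each card by the judges.

    `sortings` is a list with one dict per judge, each mapping a
    card (a type of experience) to the pile number it was placed in.
    Every judge is assumed to have sorted every card.
    """
    cards = sortings[0].keys()
    n_judges = len(sortings)
    return {card: sum(judge[card] for judge in sortings) / n_judges
            for card in cards}

# Hypothetical sortings by four judges into piles 1 (least valuable)
# through 5 (most valuable).
sortings = [
    {"case worker, public agency": 5, "school teacher": 3, "file clerk": 1},
    {"case worker, public agency": 4, "school teacher": 3, "file clerk": 2},
    {"case worker, public agency": 5, "school teacher": 2, "file clerk": 1},
    {"case worker, public agency": 4, "school teacher": 4, "file clerk": 2},
]

values = scale_values(sortings)
for card, value in values.items():
    print(f"{card}: {value}")
```

A new type of experience met in a later examination can be added
by having the judges place one more card and averaging its pile
numbers in the same way.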
Still another working hypothesis is that experience which
is progressive is more valuable than the same amount of
experience on the same job or in jobs of decreasing
responsibility. Thus, two people may each have three years of
experience: one as junior visitor, one as senior visitor and one
as case work supervisor. The one who began as visitor and worked
up to case work supervisor is a far better risk than the
individual whose initial employment was as case work supervisor
and whose subsequent positions have been progressively less
responsible. Any system of rating should reflect a difference
between the two.
It is generally accepted that recent experience is more
valuable than remote experience. This has the corollary that
experience gained more than a certain number of years ago,
without intervening, pertinent experience, is of no value.3 This
hypothesis gains its force from the fact that individuals forget
skills unless they continue to use them. A person whose latest
experience in the field of social work was 10 or 20 years ago no
longer possesses the skills which he had at the time that
experience was fresh in his memory. Moreover, in certain fields
practices are changing so that the individual whose latest
experience in the field was gained 20 years ago has probably not
learned those skills and knowledges which are today considered
important.
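
The corollary about remote experience can be stated as a rule:
working backward from the examination date, a job is credited only
if no more than some fixed number of years separates it from the
examination or from later pertinent experience. The sketch below
applies that rule. The 10-year gap, the job records, and the 1946
examination year are hypothetical, and every job listed is assumed
to be pertinent.

```python
def creditable_jobs(jobs, exam_year, max_gap_years=10):
    """Return the jobs that still earn credit, most recent first.

    Every job passed in is assumed to be pertinent. The 10-year gap
    is a hypothetical figure, not one proposed in the text.
    """
    credited = []
    reference_year = exam_year
    for job in sorted(jobs, key=lambda j: j["end"], reverse=True):
        if reference_year - job["end"] > max_gap_years:
            break  # nothing pertinent bridges the gap to earlier jobs
        credited.append(job)
        reference_year = min(reference_year, job["start"])
    return credited

# Applicant whose early experience is bridged by later pertinent work.
continuous = [
    {"title": "visitor", "start": 1920, "end": 1925},
    {"title": "senior visitor", "start": 1930, "end": 1940},
]
# Applicant whose only experience ended 21 years before the examination.
lapsed = [{"title": "visitor", "start": 1920, "end": 1925}]

continuous_titles = [j["title"] for j in creditable_jobs(continuous, 1946)]
lapsed_titles = [j["title"] for j in creditable_jobs(lapsed, 1946)]
print(continuous_titles, lapsed_titles)
```

Whether the bridged early job should then receive full credit is a
separate question, since it may already fall beyond the maximum of
creditable experience.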

It might also be proposed as an hypothesis that education
should be credited on the basis of its recency. Application of
this hypothesis to actual rating would, however, so penalize
the older applicants whose education was gained a number of
years ago as to constitute highly questionable practice.
Moreover, the character of the school curriculum does not
change with such rapidity as to warrant the conclusion that a
bachelor’s degree gained in 1910 is less valuable than a
bachelor’s degree gained in 1940. In the area of graduate work
in certain specialties the situation may, of course, be
different, although the public relations problem involved in
penalizing or disqualifying candidates beyond a certain age
still remains.

There is general agreement that more responsible experience
is more predictive of success than less responsible experience;
this assumes, of course, that the previous job is not
significantly more responsible than the position for which
application is made. For a routine operating job which carries
little or no responsibility, an individual who has carried broad
administrative responsibility for a large program may not be a
good risk. Even though such an individual might possess the
necessary skills and abilities (this is not necessarily true),
his dissatisfaction with the routine character of the new duties
would probably result in an unsatisfactory job performance.

3 If the intervening experience is in the same field, then the
more remote experience might receive no credit under the
hypothesis of a maximum amount of experience beyond which there
is no presumed increase in competence.


Rating Experience for Supervisory and 
Administrative Positions 
In rating experience for the entering level of supervisory or
administrative positions, special problems are raised. The
person being considered for the lowest level of case work
supervisor cannot normally be expected to present supervisory
experience. If this requirement is imposed, two questions are
immediately raised: (a) Where is he going to get such
experience? and (b) If he has had such experience in supervisory
positions, why is he interested in another job at the same
level? If, on the other hand, no such requirement is imposed, we
face the problem arising from the fact that experience in a
nonsupervisory position often gives no indication of
potentialities as a supervisor. There should be further study of
the types of nonsupervisory experience which may have predictive
value for supervisory positions.

Evaluating Quality of Experience 
The problem of evaluating the quality of experience as
distinguished from its pertinency is inevitably raised in any
discussion of the rating of experience. Quality of experience or
education may refer to either of two different aspects. The
first is the reputation of the school or the agency in which the
experience was gained. Certain schools are undoubtedly better
equipped to provide the knowledges, skills, and abilities and
maintain much higher standards of admission and graduation
than others. Graduates of such schools are presumptively
better qualified than graduates of other schools less well
equipped. The same considerations apply, of course, to
experience in particular agencies. Certain agencies with a
reputation for excellent work, strong supervision, and a planned
program of staff development are probably far more likely to
provide their staff members with the skills and knowledges
necessary for the performance of closely related jobs than are
agencies whose reputation in this area is not so high. The
second aspect of the quality of experience relates not to the
quality of the school or agency, but to the quality of the
individual’s performance as evidenced by school grades or by
service ratings.

Questions are inevitably raised as to the desirability of
including either or both of these factors in any evaluation.
There is general agreement that these hypotheses are most
reasonable. It is reasonable to suppose that experience in a
school or agency of high quality is more predictive of success
than experience in one whose quality of work is less well
regarded. Similarly, the individual whose performance in either
type of school or agency was exceptionally good is a better risk
than the individual whose performance was mediocre or just
barely acceptable. On the other hand, two apparently
insurmountable obstacles present themselves in actual practice.
The first is the propriety of the merit agency (or any other
evaluating body) attempting to rank the quality of educational
institutions beyond the separation into accredited and
nonaccredited institutions given through their own accrediting
associations. The same considerations apply to the ranking of
employing agencies in the field of welfare. Moreover, in most
fields the number of employers is so great that any ranking or
classification would be administratively impossible even if no
other problems were involved.




The other and still more serious consideration is that there
is no objective method of comparing the quality of individual
performance in one educational institution or employment
situation with that of another, granted that there are
differences in quality. Not only do universities differ in their
grading standards, but within a single university different
administrative units differ so that the individual who graduates
from school Y with a straight A average may or may not be
superior in his performance to the individual who graduated from
another department of the same university with a lower average.
The problems involved in equating service ratings given by
different employers using different systems (in many cases
using no system whatever) appear insoluble.

If individuals are to be compared as to quality of performance
in their previous experience or educational history, the
same standard of comparison must be applied to all individuals
and we must look beyond the experience record for our
comparison. We are not, however, left without a measure of
quality of experience. The primary interest is not in experience
or education per se but rather in the knowledges and skills
acquired; it is reasonable to suppose that those individuals
whose performance was superior will have acquired more knowledge
and greater skill. If they have not, then there is no basis for
the differential weighting. If they have, the difference will
[...] at least for certain types of test, the rank order of the
written examination is very close to class rank within a single
school. When several schools from the same school system are
combined, however, the relationship becomes much less and may
disappear entirely because of the lack of comparability among
schools.
The only practicable method of rating the quality of the
individual’s past performance is to investigate the evidence of
that past performance as it is expressed in present knowledge
and skill through the use of the written examination, the
performance test and the oral interview. If the quality of
previous experience is not manifest in terms of presently
demonstrable knowledge or skill, then it is of little or no
consequence in the prediction of future success. In any event,
there appears to be no other administratively practicable method
of taking quality of performance or quality of the agency or
institution into account in the systematic selection process.


Rating of Experience Contrasted with Other 
Prediction Methods 


The considerations just raised provide an appropriate
introduction to another aspect of the rating of training and
experience. We are concerned, as we have said before, not with
the evaluation of training and experience as such, but with the
prediction of future job success. It is a truism in prediction
that each element in the prediction formula (the written
examination, the oral interview, the performance test, and the
evaluation of training and experience) should make an
independent contribution to the total prediction. The oral
interview is valuable insofar as it measures aspects of the
individual's performance not already better measured by the
other components of the selection method. The same consideration
applies with equal force to the rating of experience. If that
rating does no more than confirm the prediction already
available through the results of the written examination and the
oral interview, it makes no contribution which would justify its
inclusion in the selection process. The question would, of
course, be raised whether our prediction should be based upon
the written examination or upon the rating of training and
experience. The answer is made in terms of the reliability of
the estimate and in terms of administrative costs.
No one has questioned that the written examination is a far
more reliable measure than is the best possible prediction based
on the rating of training and experience. There is evidence
that the rating of training and experience is more rather than
less expensive than the written examination, if the same quality
of prediction is to be achieved. All of which leads to the
conclusion that the rating of training and experience should be
so arranged that it measures, not all aspects of knowledge,
skill, and ability, but rather only those aspects of the
individual's presumptive job competence not already measured
more economically and more reliably by the other components
of the total selection process. It is primarily useful as an
auxiliary selection tool.
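
The requirement of an independent contribution can be checked with
a first-order partial correlation: the correlation between the
experience rating and job success after the written examination
has been held constant. The sketch below uses the standard
partial-correlation formula; the technique is offered as one
reasonable check and is not named in the text. The four
candidates' scores are invented and were constructed so that one
rating merely repeats the examination while the other does not.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def partial_r(criterion, rating, exam):
    """Correlation of rating with criterion, holding exam constant."""
    r_cr = pearson_r(criterion, rating)
    r_ce = pearson_r(criterion, exam)
    r_re = pearson_r(rating, exam)
    return (r_cr - r_ce * r_re) / sqrt((1 - r_ce ** 2) * (1 - r_re ** 2))

# Hypothetical scores for four candidates.
exam = [7, 9, 11, 13]                # written examination
job_success = [8, 8, 10, 14]         # later criterion of success
rating_redundant = [6, 12, 8, 14]    # overlaps the examination only
rating_independent = [10, 12, 6, 12] # unrelated to the examination

p_redundant = partial_r(job_success, rating_redundant, exam)
p_independent = partial_r(job_success, rating_independent, exam)
print(round(p_redundant, 3), round(p_independent, 3))
```

In this constructed case the first rating has a substantial simple
correlation with success yet contributes essentially nothing once
the examination is known, while the second retains a clearly
positive partial correlation. Only the second kind of rating
would justify its place as an auxiliary tool.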


Other Problems in Rating Experience 


The discussion thus far has been in terms of general principles
and general hypotheses which might be given specific
application to the rating of training and experience for a
particular job. It should be emphasized that acceptance of any
or all of these hypotheses as a working basis is merely the
beginning of our task of rating. There remain the tasks of
deciding for each particular type of work precisely what is the
minimum education and experience required and of determining the
basis on which education may be substituted for experience. Is
the most pertinent education worth the same as, or more, or less
than the most pertinent type of experience? For this particular
type of job, what is the maximum education or the maximum
of experience beyond which no increase is likely to result in an
increased probability of success? Considering all of the types
of experience which might be offered by prospective candidates
for the particular job in a particular agency, how should these
types of experience be classified as to degree of pertinence?
How many degrees should there be? What credits should be
given for each level of pertinence? How shall the credits be
adjusted to assure that progressive experience receives more
credit than experience which was not progressive? How recent
must experience have been to be credited at all? How shall the
weight to be assigned each individual year of experience in a
particular type of work be adjusted so that the most recent
experience receives the greatest credit? How shall the more
recent, but less pertinent experience be related to more
pertinent, but less recent experience? What shall be done with
such special problems as the question of crediting part-time or
volunteer experience or education gained outside the normal
academic institutions, in correspondence schools, business
colleges or schools which are not accredited? Shall we grant
credit for the possession of a college degree in addition to
that given for years of training? (The individual who attended
college for 4 years and did not get a bachelor’s degree may have
failed to get the degree because his work was inferior in
quality or because his inability to swim the length of the pool
prevented his passing Physical Education I.)

These and a number of other specific and troublesome ques- 
tions must be answered before the evaluation of training and 
experience for a particular type of job in a particular agency 
can proceed—even on the basis of the working hypotheses 
which have been outlined above. After they have been 
answered and experience evaluated on the basis of these
hypotheses, there remains a possibility that little or no independent
contribution to the prediction of job success is made by
such evaluation. The hypotheses, however reasonable they
may appear, may very well not be substantiated by actual
facts. Almost certainly the weights assigned to varying levels
of pertinency by a judgmental process are not the weights
which would result in the most effective prediction and very
conceivably might result in predictions which run contrary to
the facts. For example, if we assume that because pertinent
experience is good, more of the same experience is better, we
might continue crediting experience so that the individual with
the greatest number of years of experience will receive the
greatest credit. But, on the other hand, an individual whose
rate of promotion in his professional field has been extremely
rapid so that he is eligible for consideration for an advanced
job with a small number of years of experience behind him is a
better risk than the individual whose rate of promotion was
such that it took him 20 years to become eligible for the same
position. The plausible hypothesis of the more experience the
greater the credit leads us to an improbable result. How many
of the hypotheses described are equally naive cannot be said
until they have been tested in the light of factual information.

In summary, experience is of value not in itself but as evidence
of knowledge and abilities from which to predict success.
A number of working hypotheses have been examined; each,
though plausible, needs verification. It clearly appears
that, however refined the rating process, the inadequacies of
the applicant's record of experience impose severe limits on the
accuracy of prediction; careful rating is useful primarily as an
auxiliary selection device rather than as the principal basis of
selection.




THE DEVELOPMENT OF AN ENGLISH USAGE TEST 
FOR CLERKS, TYPISTS, AND STENOGRAPHERS¹


KENNETH L. BEAN 


Louisiana Department of State Civil Service 


Many different forms of test questions have been devised to 
measure ability to spell, punctuate, use correct grammar, and 
employ the right word in the right context, in other words the 
mechanics of English, which all typists and stenographers and 
clerks should know. One classical academic form of test
in this field is the straight dictation by the examiner of sentences
which are taken down in longhand. Grading of such
papers is laborious. Printed sentences in which errors are to
be corrected in pencil represent some improvement in method
so far as scoring is concerned, but the location and counting of
right and wrong responses is still rather tedious. Multiple-choice
items which isolate within a sentence some particular
problem of punctuation, spelling, or grammar are easily scored,
but usually are not as difficult as sentences in which the
errors are not made obvious by selecting some word or phrase
with three, four, or five choices. Some civil service examinations
are now used in which sentences are given with four
numbers scattered along above each, with one number directly
over the error as shown in Sample A below.


 1 2 3 4
Sample A: The man who I wanted to see is occupied this afternoon.
In the above illustration as well as in most other items in
this form which we have observed, the error is again made too
obvious by its position directly under a number. The incorrect
numbers are seldom above anything that might wrongly be
considered an error. We therefore contend that, although the
material in most of these tests contains excellent illustrations
of the principles which clerks, typists, and stenographers need
to know, the sentences as presented are not those used in the
typical situation encountered in an office where the writing of
acceptable reports or letters or where the correcting of copy is
required.

1 The writer wishes to acknowledge the cooperation of Anna Lee Brown, Norman
[...]onn, and Donnell Read of the Examining Division of the Louisiana Department
of State Civil Service, who assisted in this research.

In order to measure a certain amount of proof-reading skill 
which we believe to be essential for jobs in these classes, and in 
order to increase the difficulty of recognizing errors while at 
the same time retaining the ease of scoring characteristic of the 
multiple-choice form, we originated a new manner of presentation
of English usage material. An elaborate system of symbols
designating grammar as G, usage as U, etc., was felt to be
superfluous, because it would involve following complex directions,
thus introducing an additional difficulty factor which we
were not attempting to measure in this section of the test. Also
it seemed that the duties of clerical positions would not require
the candidate to define the type of error in this way. Although
it might be ideal for him to be able to recognize the
principle involved, he needs only to sense where something is
wrong in order to identify the mistake by number. In our
system of presentation, each sentence is divided into four sections
by means of diagonal lines. Each section is numbered,
but the number does not necessarily fall directly above the
word, phrase, or punctuation mark that should be corrected.
Some of the sentences are entirely right, and are to be answered
“R” instead of by a number. The answer to each item is
always either a number or else R. Sample B shows the form
of presentation of our material.


               1              2              3             4
Sample B: The man who/ I wanted to see/ is occupied/ this afternoon.

In this illustration there are no clues as to the location of the
error, and it is therefore more difficult than Sample A. Most
of the sentences are complex enough in vocabulary or structure
so that a candidate who is uncertain about any of the accepted
rules might easily think he had found an error in spelling,
punctuation, or word usage in a section that is actually entirely
correct.
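
The answer format described above (a section number from 1 to 4, or R for a sentence that is entirely right) can be scored mechanically. The following Python sketch is illustrative only: the function name `score_usage_items`, the three-item key, and the candidate's responses are assumptions made for the example and are not taken from the test itself. The one grounded value is the key for Sample B, where the error ("who" for "whom") lies in section 1.

```python
VALID_ANSWERS = {"1", "2", "3", "4", "R"}

def score_usage_items(key, responses):
    """Return the number of items answered correctly.

    key and responses are sequences of strings; each entry is a
    section number "1" to "4", or "R" for an entirely right sentence.
    Blank or invalid responses simply earn no credit.
    """
    correct = 0
    for right_answer, given in zip(key, responses):
        given = given.strip().upper()
        if given in VALID_ANSWERS and given == right_answer:
            correct += 1
    return correct

# Hypothetical three-item key; the first item is Sample B (error in section 1).
key = ["1", "R", "3"]
candidate = ["1", "r", "2"]
print(score_usage_items(key, candidate))
```

Because every response is a single symbol, scoring keeps the ease of the multiple-choice form even though the sentence itself gives no clue to the location of the error.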
The Louisiana Department of State Civil Service has been
giving tests for entrance-level jobs and for higher grades of
positions in this series for nearly three years at intervals varying
from six weeks to three months. The entrance classes included
Clerk I, Typist Clerk I, and Stenographer Clerk I, while
the higher levels included Clerk II, Typist Clerk II, and Stenographer
Clerk II. The same written tests applied to Clerks,
Typists, or Stenographers at each of the two levels, the level II
examinations being the more difficult in content. The level I
material covered clerical aptitude, following directions, arithmetic
reasoning and English usage, while the level II included
the same clerical aptitude test, more advanced items in each of
the other sections named above, and some multiple-choice
questions on office practices.
Originally the English Usage test consisted at each level of
[...] items overlapping somewhat in content. A reduction in
length to 15 at each level, however, ultimately became necessary
because of the objections raised as to the long duration of the
examination. Four forms as nearly equivalent as possible
were developed for use in rotation to prevent practice
effects for candidates repeating a test. Four repetitions
were allowed for each individual, but very few took
the test that often.
An item analysis was made for 40 items of the 120 then in use.
The results summarized below were found by tabulating
the responses of 256 applicants for these positions who took
the I and II level tests in 1943. Because of the manpower
shortage during the war, these individuals should not be considered
typical of peacetime candidates for these classes of positions.
The test was written with this wartime sample of the
population in mind, and the difficulty level of many of the items
is too low for normal recruitment conditions. However, since
the scores approximated a normal distribution as closely
as a group of that size would be expected to do at best, we
believe that we have in these results a useful indication of the
relative merit of our sentences.

The tetrachoric correlations were estimated graphically by
means of the Nomograph for Item-test Correlation from Percentage
of Upper and Lower 50% Passing the Item, prepared
by Mosier and McQuitty (5). Interpolation from this chart
can be made more accurately from the middle ranges of percentages
than from the extremes. If either percentage of right
responses was above 95 or below 5, the correlation given means
nothing more than a rough estimate, and cannot be considered
accurate to the second decimal place. Even in the middle
ranges, as Mosier has pointed out, we cannot depend too much
on figures beyond the first decimal place, since the PE of tetrachoric
correlations is roughly twice as great as that of the
product moment r.
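
For readers without the nomograph at hand, the kind of estimate it yields can be roughly illustrated in code. The Python sketch below uses the well-known cosine-pi approximation to the tetrachoric correlation. This is a substituted technique, not the Mosier-McQuitty chart itself, and the function name `tetrachoric_cos_pi` is an assumption. The approximation is at its best when the groups are split near the median, as they are here, and like the chart it becomes untrustworthy when either percentage approaches 0 or 100.

```python
import math

def tetrachoric_cos_pi(pct_upper: float, pct_lower: float) -> float:
    """Approximate tetrachoric r from the per cent passing an item in the
    upper and lower halves of the total-score distribution."""
    a = pct_upper / 100.0   # upper half, item right
    b = 1.0 - a             # upper half, item wrong
    c = pct_lower / 100.0   # lower half, item right
    d = 1.0 - c             # lower half, item wrong
    if b * c == 0.0:
        # An empty off-diagonal cell pins the estimate at +1;
        # if a*d is also zero the coefficient is undefined.
        return 1.0 if a * d > 0.0 else math.nan
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))

# An item passed by 90.5 per cent of the upper half and 42.0 per cent of the lower half
print(round(tetrachoric_cos_pi(90.5, 42.0), 2))
```

An item passed equally often by both halves gives an estimate of zero, and an item passed more often by the lower half gives a negative value, which is the behavior one would expect of any discrimination index of this type.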

TABLE 1 


Per cent correct Tetrachoric Average
Upper 128 Lower 128 Correlation
100 84 93 2 
2) m 5 65.5% 
90.5 42.0 7735 : 


The per cent correct distribution was more nearly normal
for the lower group than for the upper group. On the whole,
however, the test section was approximately at the right level
of difficulty for the entire group. The median tetrachoric
correlation is fairly high. Although some attempt was made to
find possible causes for low correlations on a few items, very
little of importance was gained by inspection of these items.
Items having low correlations covered punctuation, spelling,
and grammatical errors of a nature that were not considered
controversial by expert consultants. Difficulty of the item did
not seem to be a factor contributing to low correlations.

Although items in the English Usage section of our test did
not show quite as high correlations on the whole as the
items in the Following Directions and Office Arithmetic sections,
most of them were of sufficient value to be retained. No
exact criteria seem to have been established and agreed upon
by investigators in the testing field for acceptance or rejection
of an item on the basis of tetrachoric r. Much depends upon
the particular purpose for which the given test or test section is
intended. However, we were faced with the necessity of a decision
with little time for further investigation to determine
exactly where to draw the dividing line between acceptable and
unacceptable sentences. Therefore we omitted items having a
correlation below .50, of which there were three in this test
section, as being too low in discriminating value, and we are
considering ultimately dispensing with five other sentences in
this group having over 90 per cent right answers in this sample
of cases.
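
The two screening rules just stated (reject below a tetrachoric correlation of .50; reconsider above 90 per cent right answers) can be written out as a simple decision function. The Python sketch below is only an illustration of those two rules. The function name `screen_item`, the category labels, and the three sample items are hypothetical and do not come from the item analysis reported here.

```python
def screen_item(r_tet: float, pct_correct: float) -> str:
    """Apply the two screening rules described in the text to one item.

    r_tet        tetrachoric item-test correlation
    pct_correct  per cent of all candidates answering the item correctly
    """
    if r_tet < 0.50:
        return "omit: too low in discriminating value"
    if pct_correct > 90.0:
        return "consider dropping: too easy"
    return "retain"

# Hypothetical items: (correlation, per cent correct)
for r_tet, pct in [(0.42, 70.0), (0.80, 95.0), (0.77, 66.0)]:
    print(r_tet, pct, "->", screen_item(r_tet, pct))
```

The correlation rule is applied first, so an item that is both weak in discrimination and very easy is reported as omitted rather than merely flagged.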

Other criteria need to be considered besides item analysis
data to determine whether a sentence is fit for retention. Experts
were selected who were at the time teaching business
letter writing and related subjects at Louisiana State University.
These specialists were asked to review all 120 items then
in use to determine whether every principle illustrated was
defensible in terms of modern practice. Several were found,
including three in the group under consideration here, that
were obsolete. Rules, particularly with regard to punctuation,
have undergone some change. Those who went to school before
1925 were usually taught that a comma must be used under
certain circumstances, while those whose training is more
recent learned that omission of the comma under some of these
same conditions is perfectly acceptable. Where differences of
opinion among present and past authorities were found, we
avoided any illustration of a point not clearly defensible.
On the whole the item analysis revealed that the most
difficult sentences were those involving a choice between who
and whom. Punctuation ranked second in difficulty among the
problems presented by our sentences. Then followed in order
word usage, spelling, and capitalization. The above statements
should not be applied as generalizations, since only a very small
sample of each type of error could be included in a test section
of this length. Probably another set of 40 sentences would
[...] of errors is large with much variability.
Although some attempt was made to give easier material to the 
level I applicants than to the level II candidates, both groups 
got about the same proportion of sentences containing each 
type of error. This principle was followed in constructing each 
of the four forms later used in rotation. 

If a test covering the mechanics of English is to be ac- 
curately diagnostic of probable degree of success in the kind of 
writing expected in an office, it should contain more than 4 
items, but there are other factors to measure that are important 
on these jobs. Recruitment conditions do not permit us to 
subject people to endurance tests for jobs that pay less than 
those in industry. Reluctantly we have been forced for the
sake of public acceptability to reduce the length to 30 items.
Knowledge of the validity of every item thus becomes all the
more important.

In 1944 the 120 items then in use were given to 200 college
freshmen at Louisiana State University. The resulting scores
were then correlated with the Purdue Placement Test in English.
The Pearson product moment r found was .71 with
PE .017. When our test is reduced in length to 40 items, the r is [...].
This result is not surprising when we take into account the fact
that the Purdue test covers a wider scope of knowledge than our
own and has a somewhat different purpose.

We also had available for the same group of students the
scores made on the American Council on Education Psychological
Examination. This verbal group test of general intelligence
was correlated with our 120-item scores on English Usage, and
the product moment r was found to be .65 with PE .02. Reducing
our own test to 40 items gave a correlation of .59. The
1943 edition of the American Council examination used with this
group correlates .64 (PE .02) with the Purdue Placement Test
in English.

The reliability of the 120-item test found by the split-half
method on the same group of freshmen was .84 with PE [...].
This would be considerably lowered by shortening the test, but
we are primarily interested in the reliability of our clerical test
as a whole rather than any one section of it, since we are not
using individual sections for diagnostic or prognostic purposes.
The split-half reliability of the entire written examination for
the level I is .69 with PE .04 on the basis of 82 cases. For the
level II written test it was .70 with PE .05 based on 52 cases.
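
How much the reliability of the 120-item test "would be considerably lowered by shortening" can be gauged with the Spearman-Brown formula, on the assumption that the items removed are comparable to those retained. The Python sketch below is an illustration under that assumption. The function name `spearman_brown` is hypothetical, and the shortened-test figures it produces are projections, not values reported in this study.

```python
def spearman_brown(reliability: float, length_ratio: float) -> float:
    """Project reliability when test length is multiplied by length_ratio.

    length_ratio = new length / old length (e.g. 40/120 for shortening,
    2 for stepping a half-test correlation up to full length).
    """
    return (length_ratio * reliability) / (1.0 + (length_ratio - 1.0) * reliability)

full = 0.84  # split-half reliability reported for the 120-item test
for items in (40, 30, 15):
    projected = spearman_brown(full, items / 120)
    print(items, "items:", round(projected, 2))
```

With `length_ratio=2` the same function gives the usual step-up from the correlation between two half-tests to the reliability of the whole test, which is the correction ordinarily applied in the split-half method.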
An objection raised by some individuals to the inclusion of
an English Usage test as a part of clerical examinations is that
young applicants just out of school would have the advantage
over older persons who might be quite rusty on grammar or
spelling and still be the most efficient workers in the office. If
ability to use correct English is extensively applied on the job,
such an objection, even if true, would have no validity, since
we must select those who are qualified in all important respects.
To investigate the hypothesis of these objectors, we correlated
age with scores on the levels I and II English Usage sections.
The results are shown in Table 2. Age was not normally distributed
in either of these two groups, and particularly at the
level II the mean is probably considerably distorted by a few
extreme cases at the upper end of the distribution. There is a
tendency for the ages to cluster decidedly within a few step intervals
at the lower end, and this should be taken into account
in interpreting the data.


TABLE 2


Correlation of Score with Age


                Variable I, Age        Variable II, Score
Level    N      Mean     S.D.     Section    Mean     S.D.      r      PE

I        82     21.96     8.19    English     9.2     2.93    -.06    .07
I        82     21.96     8.19    Whole      33.05    6.45    -.07    .07
II       54     26.37    10.70    English    13.81    4.57    -.39    .08
II       54     26.37    10.70    Whole      48.60    9.29    -.33    .08
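
The probable errors in Table 2 agree with the conventional formula PE = .6745(1 - r²)/√N. The article does not state which formula was used, so the Python sketch below rests on that assumption, and the function name `probable_error_r` is hypothetical. It recomputes the PE for each of the four correlations in the table.

```python
import math

def probable_error_r(r: float, n: int) -> float:
    """Probable error of a product-moment r based on n cases,
    using the conventional formula 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

# (level, section, r, N) as read from Table 2
rows = [("I", "English", -0.06, 82), ("I", "Whole", -0.07, 82),
        ("II", "English", -0.39, 54), ("II", "Whole", -0.33, 54)]
for level, section, r, n in rows:
    print(level, section, round(probable_error_r(r, n), 2))
```

A correlation is customarily regarded as dependable only when it is several times its probable error, which is why the level I coefficients, smaller than their own PE, are treated in the text as not significant.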

If older people tended to make lower scores, at least a
fairly high negative correlation should be found between
score and age. As will be noted from the table of results, no
significant correlation exists at the level I, while at the level II
the correlation, though negative, is moderately low. A few people well
along in years can be found who make low scores, but it would
be interesting to know how they rank as office workers. Perhaps
in the future an adequate system of service ratings may
help us in carrying this study to a point where the objection may
be answered more conclusively. As long as these items appear
from the data available to be rather closely related to general
intelligence and to the requirements of the job, we will continue
to use them. On the whole public reaction to them has not been
unfavorable, and with further refinement they should constitute
a valuable part of our clerical test.

From Table 2 it is interesting to note that the correlations 
for English Usage do not differ significantly from those for the 
whole written tests for each level. Fifteen English sentences 
were taken by candidates at each level, and apparently 
level II candidates made a better average on the more difficult 
ones than the level I did on the easier ones. 


Summary 
In this study we have attempted first to compare various
techniques for presenting English usage material and to point
out ways in which we believe that our form represents an improvement
upon many tests now in use. An item analysis was
presented which singled out a few sentences that either appeared
to lack validity or were too obvious to present any
problem. Further eliminations were made as a result of consultation
with experts who contended that a few of our illustrations
were doubtful as to defensibility in terms of modern
business practice. We have shown that our test correlates
moderately high with the Purdue Placement Test in English
and with a recognized group test of general intelligence. Our
entrance level test has been demonstrated to have no significant
correlation with age, and at the higher level our test has
a low negative relationship with age. In each case these correlations
for English closely approximated those for the entire
clerical test.

Although we recognize that Travers (6) has presented some
valuable evidence that common methods of estimating item
validity are subject to wide variation with different groups and
on different occasions, we maintain that our preliminary item
analysis given here probably gives us some information on
item validity not obtainable through opinions of experts. The
next step would seem to be a similar study of the same material
on a different group of applicants to determine how results on
the two groups would correlate. Also we plan to calculate
tetrachoric correlations on the remaining items in more recent
forms of the test not yet statistically analyzed.

Possibly the samples of the population applying for clerical
positions with the State may change somewhat in the direction
of more capable and better trained people. If this happens we
may raise standards and increase the difficulty level of the
entire test by eliminating easy items and constructing new
items of more suitable difficulty. For wartime recruitment
purposes most of the present material has served quite well.


REFERENCES 


1. Carter, Harold D. “How Reliable Are the Common Measures of
   Difficulty and Validity of Objective Test Items?” Journal
   of Psychology, XIII (1942), 31-39.
2. Chesire, L., Saffir, M. and Thurstone, L. L. Computing Diagrams
   for the Tetrachoric Correlation Coefficient. Chicago: University
   of Chicago Bookstore, 1933.
3. Dunlap, J. W. “Note on the Computation of Tetrachoric Correlation.”
   Psychometrika, V (1940), 137-140.
4. Fulcher, J. S. and Zubin, Joseph. “The Item Analyzer, a
   Mechanical Device for Treating the Fourfold Table in Large
   Samples.” Journal of Applied Psychology, XXVI (1942),
   11-
5. Mosier, C. I. and McQuitty, J. V. “Methods of Item Validation
   and Abacs for Item-test Correlation and Critical Ratio of
   Upper-lower Differences.” Psychometrika, V (1940), 57-
6. Travers, Robert M. “Note on the Value of Customary Measures
   of Item Validity.” Journal of Applied Psychology, XXVI
   (1942), 625-632.


ARMY GENERAL CLASSIFICATION TEST RESULTS 
FOR AIR FORCES SPECIALISTS 


THOMAS W. HARRELL 
University of Illinois 


To paraphrase the remark concerning horse racing attributed
to the Shah of Persia, some readers know that one
occupational group will be brighter than others and they do
not care which one is brightest. Other readers may have some
interest in which occupational group is brightest, which is least
bright, and which are in between, and in what order.

Army General Classification Test (GCT) results are given
in Table 1 for 774,383 men in 209 Army Air Forces (AAF)
Military Occupational Specialties. The median score was 103.7.
The GCT, the World War II model of Army Alpha, is a
test designed to determine to what extent soldiers
will succeed in training (1). After practice there is a forty-minute
time limit. There are four equivalent forms consisting
of multiple-choice items with four alternatives. The items are
steeply graded in difficulty. Three types of items, vocabulary,
arithmetic reasoning, and block counting, occur by
cycles of five items in Form A. Form A has 150 items. GCT
scores are converted into a standard score scale with an
estimated mean of 100 and a standard deviation of 20. The
median for men entering the Army up to June 30, 1944, is
estimated at 98.7 (2). A system of five grades is used with the
following standard score ranges.¹


I      130 and above
II     110-129
III    90-109
IV     60-89²
V      59 and below


1 The standard scores were based on the estimated U. S. male population of
military age with a mean of 100 and a standard deviation of 20. For at least two
reasons the GCT is not directly comparable with IQ's over its entire range. The
dispersion is one reason, which on the Stanford-Binet results in a standard deviation
of approximately 16 points. The GCT is a language test and consequently some low
scores are the result primarily of language difficulty. Such difficulty was taken into
account by the Army's giving additional tests which are beyond the scope of this
paper.


The percentage of AAF Military Occupational Specialties
whose scores are in each of the five grades is shown in Table 1.
Also in Table 1 is shown the median for each specialty. The
duties of each job are defined in Army Regulations 615-26
(September 15, 1942).

The distribution of AAF enlisted men for the five grades of
GCT is seen from Table 1 to be 6.3% in Grade I; 33.2% in
Grade II; 33.1% in Grade III; 21.8% in Grade IV; 5.6% in
Grade V. The results of all men entering the Army up to June
30, 1944, gave the following per cents: Grade I, 6.1%; Grade
II, 26.6%; Grade III, 30.5%; Grade IV, 27.3%; Grade V,
9.5%.
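
The medians quoted in the text can be recovered from these grade percentages by straight-line interpolation within the grade that contains the 50th percentile. The Python sketch below makes that assumption explicit. The function name `interpolated_median` is hypothetical, each grade is treated as running from its lower limit up to the lower limit of the next grade, and Grade I is closed at 157, the highest score assumed in the footnote to Table 1. The Grade V figure of 9.5 for all men entering the Army is taken as the remainder after the other four grades. Under these assumptions the sketch reproduces both 103.7 for AAF enlisted men and 98.7 for all men entering the Army.

```python
# (grade, lower limit, upper limit), from lowest to highest.
# Grade V has no stated lower limit, so a median falling there is not handled.
INTERVALS = [("IV", 60, 90), ("III", 90, 110), ("II", 110, 130), ("I", 130, 157)]

def interpolated_median(pct: dict) -> float:
    """Median standard score from the percentage of men in each grade."""
    below = pct["V"]
    if below >= 50:
        raise ValueError("median falls in Grade V, which has no lower limit")
    for grade, low, high in INTERVALS:
        share = pct[grade]
        if share > 0 and below + share >= 50:
            # Linear interpolation inside the grade holding the 50th percentile
            return low + (50 - below) / share * (high - low)
        below += share
    raise ValueError("percentages do not reach 50")

aaf = {"I": 6.3, "II": 33.2, "III": 33.1, "IV": 21.8, "V": 5.6}
army = {"I": 6.1, "II": 26.6, "III": 30.5, "IV": 27.3, "V": 9.5}
print(round(interpolated_median(aaf), 1), round(interpolated_median(army), 1))
```

The same function can be applied to the grade percentages of any single specialty in Table 1, bearing in mind that the published percentages are rounded to whole numbers and so will not always return the tabled median to the last decimal.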

TABLE 1


Army General Classification Test Results for Military Occupational Specialties
AAF Enlisted Men (White and Colored)


Title SSN* N I II III IV V Median†


Weather Forecaster ............... 787 726 64 32 4 0 0 136.7‡
Link Celestial Navigation Trainer 128.4 
ÉFATOR пае E baee rema 970 317 46 0 4 0 07 
Link Celestial Navigation Trainer 0 1280 
CHC uiuo as масала ctu 969 240 45 45 9 ] 0 1262 
Weather Observer 784 4,516 40 51 8 1 0 123.8 . 
Bombsight Mechanic .. 683 1,64 34 52 12 2 o 1230 
Classification Specialist 275 2,882 30 58 П 1 0 1226 
Public Relations Man ... 274 562 28 59 D 1 0 1720 
Radio Repairman VHF .. 91 271 27 58 14 I 0 1214 
Weather Station Chief |. 730 — 12 21 67 12 0 0 114 
аео E MA cen 27 47 612 1055 
Oxygen & Nitrogen Plant Operator .. 719 149 21 66 13 0 0 1206 
Personnel NCO ............. мы. 816 427 22 60 15 3 0 206 
Liaison Pilot-Mechanic ............ 2 1064 20 65 4 ] 0 1206 
adar Repairman, Airborne Search . 955 324 22 58 15 5 0 1204 
Aerial Phototopographer ........... 4 657 24 54 18 4 
* Specification Serial Number.
† Medians were interpolated from percentage in each grade.
‡ Highest score of grade I assumed to be 157 as this was highest found in
tabulated sample.
§ Bombardiers as a rule were officers. Since results are shown here only for
enlisted men, the findings here are probably not representative of all bombardiers.

2 At first 70 was the minimum score for Grade IV but the distribution
brought about a change to 60.


TABLE 1 (Continued)


Title SSN* N I II III IV V Median†
ink Trainer Instr 
in т Instructor asni incra 658 7,636 23 56 19 2 0 1204 
go Берта, Loeb mt . 1952 376 17 67 15 1 0 1202 
EN an, ing 
" quipment 
Ша n 953 2,572 22 56 21 1 0 1200 
M QUE coque 1208 21 59 18 2 0 1200 
echnicia. Я 
Geodetic Computer 473 19 62 19 0 0 1200 
Mi Ree UI sg cue 144 22 55 19 4 0 1198 
Adp intercept 251 19 61 17 3 0 1198 
Bambari a e МОО 8755 19 60 18 3 0 1196 
Donee. 140 17 64 11 7 1 1196 
Grants 117 20 57 21 2 0 1196 
ашаса опе Chis 813 20 58 19 3 0 1196 
Typtographic Code Operator 1,190 19 58 21 `2 0 1194 
Technical Ш ‘ode Compiler 1,154 22.61 10 2 Oi 1152 
Tavestisator 110г 3844 20 55 21 3 1 1190 
ul ee tes rers ca iare 425 21 52 23 4 0 1188 
ade Clerk en Oe 5 о пев 
да; же зыны ка, ax seated a 5 1186 
Radio Rena Motion. Pietre оо. 04 135 22 49 27 2 0 1186 
Fin n, Aircrait 
Radio Oponent 647 1,729 18 55 24 3 0 1184 
amera Т, Or, AACS . 760 2,028 21 50 28 1 0 1184 
Stenographer i od 903 16 58 23 3 0 1182 
"3 Pc ee чз-еөн 213 985 13 61 24 2 0 1180 
First беге 773 127 17 55 25 3 0 1180 
Entertainment Dj 5585 536 15 57 20 7 1 1178 
Radio Ment Director 1... 442 163 13 54 27 6 O 1178 
Radio renis, ААС, › = вече 778 765 18 52 25 5 0 1178 
OWer "Turrator- Mechanic, AAF .... 756 12,090 15 57 26 2 O 1178 
Radio 19116 & Gunsight Specialist . 678 2934 17 53 27 3 0 1176 
Contro тап, FM Equipment .. 648 167 17 53 23 7 0 1176 
tard Lower Operator ууу: 552 4064 16 54 26 4 0 1174 
QPtician ОЩЕ" оа зан 663 351 13 57 23 5 2 170 
VLA Y Mb temi 365 122 13 56 25 6 0 1168 
edical T, achine Operator . 272 145 13 56 28 3 0 1168 
Radio Mab. Technician .... 858 1,573 16 51 24 8 1 1166 
ire рМесћапіс, CNS ... 759 270 11 57 30 2 0 1164 
Auto рур тап, УНЕ... 950 195 12 55 29 3 1 1164 
Draftsmat Specialist... é 298 17 48 28 7 0 1162 
Wire Chay ites 070 2,530 12 55 27 6 0 1162 
ine Сре» Тр. & Т ж 86 14 51 27 7 1 1160 
Athlete Кыке 752 1,775 12 54 27 7 0 1160 
ШОРО ед ОЕ ees 283 3329 11 56 27 6 0 1160 
Miis КОТ На 1533 134 13 5 26 8 1 1158 
Qhotopra phat Unit Chief 9s 310 9 57 27 7 0 1156 
«ТУШИ ME 152 501 13 51 28 7 1 1154 
QUSE Régie Ө 405 6881 12 52 30 6 O 1154 
power Plang quer Technician .....- 67 877 12 51 27 10 0 1152. 
Pu t Specialist ч,» җа къ чете 55 3 и 0 1152 
INCISO PEN 0 1152 
Medical qn. Telephone 213 12 51 24 13 0 1152 
рріу NCO isine ye wise 354 13 49 30 8 O 115.0 


TABLE 1 (Continued)


Title SSN* N I II III IV V Median†

Draftsman, Topographic 076 428 11 52 30 7 O Цер 
X-Ray Technician ..... s. 264 1,733 9 54 29 7 1 1148 
Photo Lab Technician ............ 945 6496 12 50 31 7 0 1146 
Clerk Non-Typist ................ 055 24675 12 49 30 8 1 1146 
Hydraulic Specialist ............. 528 489 13 48 30 8 1 n 
Radio Operator Mechanic Gunner .. 757 8451 10 51 36 3 0 n 
Instrument Specialist, Airplane .... 686 6,124 9 51 34 5 1 113.8 
Pharmacy Technician ............. 859 340 6 54 34 6 0 113.6 
Teletype Mechanic ... Me 2000 7 53 36 + 0 113.6 
Radio Mechanic, ААР nx aum lm. 
Electrical Specialist ............... 3,626 10 48 36 6 0 rn 
Bandsman .. 5717 11 47 32 9 1 1132 
и 1,780 12 45 29 D 1 132 

3007 7 51 36 5 0 1130 

5,045 9 48 34 9 1126 

DN Rid 18591 8 48 35 8 1 153 

BERE hence 6272 8 48 32 П ] 174 

364 n 44 34 8 3 153 

673 8 48 32 10 2 155 

24913 8 47 39 6 0 125 

P шонча 383 11 44 3 1 1 [5 

4,605 10 45 40 5 G 116 

989 6 48 41 5 1 114 

22] 8 4 37 9 j| 14 

196 9 44 34 9 111.0 

M9 6 46 33 15 0 110 

766 10 42 22 22 © 1108 

Агтогег-Сиппег , 12,579 5 47 43 5 0 110.8 
Airplane Mechanic 16474 6 46 44 4 0 108 
Radio Operator, High 2299 6 46 39 8 1 04 
Engine Mechanic, DO ... 762 3,331 6 45 38 11 ү 1104 
Photo-Lithographer .. «x. 107 403 7 44 40 8 1 1100 
Machinist ............. . l4 2898 6 44 37 12 ] [100 
Airplane & Engine Mechanic 747 103,42 6 44 40 9 1 1100 
File Clerk rosene „чеш зук іы в 4 36 D 5 i50 
Aerial Torpedo Mechanic 212 6 44 40 8 0 109.6 
eletype Operator ......... 3769 7 42 4 7 1 109.6 
Central Office Repairman 257 3 46 АЗ s 1 1080 
Tow Reel Operator ............... 165 4 42 40 10 2 108. 
Camouflage Technician 468 7 39 42 0 107.8 
Fuel Tank Repairman 335 4 41 47 a 1 107.8 
Tocurement Inspector ............ 562 114 10 36 38 4 

Chief Radar Operator, Designated 6 0 107. 
СОЕ MINE ар ла арий 774 124 6 37 51 É 0 1072 
Radio Operator, Low Speed ....... 776 2,622 6 38 43 n 0 1072 
Airplane Sheet Metal Worker ...... 555 5716 5 39 42 1 0 106.8 
Meat & Dairy Inspector ........... 120 1092 5 38 44 13 2 1064 
Ammunition NCO 1160 6 37 38 17 1 1064 
Lithographic Pressman 342 8 37 39 15 1 1064 
Message Center Clerk .. 975 9 40 35 15 1 106: 
Sanitary Technician 7233 4 35 57 3 2 1062 
Dental Lab Technician 510 4 38 42 d 1 1062 

Installer Repairman, Tp. & Tg. 770 4 37 47 1 


TABLE 1 (Continued) 
Title SSN* N I II III IV V Median†
Motor Inspector .................. 413 2535 3 39 40 17 1 1060 
fterinary Technician ...........- 250 184 10 32 40 17 1 1060 
dial NGG „ыш муенса 673 1307 7 36 36 19 2 1060 
Duplicating Machine Operator ..... 128 — 700 8 35 37 19 1 1060 
arachute Rigger-Repairman .....- 620 299 4 36 47 13 0 1058 
ider Mechanic ................—- 559 2426 2 37 52 9 O 1058 
ШУ Clerks кз sneno vesna ERT 835 10431 6 35 40 18 1 1056 
SOT IND сеу шыр ме etes 813 2327 4 32 40 21 3 1050 
Ets (NCO. in escas аркалы; $2 102 5 37 32 18 8 1050 
ent] Teelinigian: тылы. кийи 855 2755 3 36 42 18 2 1046 
ж Boat Operator .............. 702 237 3 37 37 19 4 1046 
ectionist, Motion Picture ....-- 137 1001 4 35 39 20 2 104 
Electrician”. non, Pigrure sess: 078 2462 4 34 30 19 3 1040 
poter MUS 015 419 4 34 40 I8 4 1040 
те e armen осна ко 0% 601 5 35 39 21 2 1038 
rVeying, Rod '& Chai infi: 191 107 4 34 38 20 4 1034 
Oreman, Wareheme Сһаіптап s. 25) — 1806 6 32 33 25 4 1028 
De ШӨ аш эк cercate Древен 458 214 7 29 38 24 2 1026 
*contaminati eus ы 
NN. cep a ны 800 1129 6 30 38 24 2 1026 
Aer a nci 511 2257 5 31 37 24 3 1024 
elder Combination |... 256 3940 3 30 44 22 1 1022 
NE ОГ, Auto ааа асе асан» 348 494 2 34 35 25 4 1020 
ess Sergeant оО. S24 567 4 30 39 23 4 101$ 
eet Metal Worker у... 201 287 2 32 38 23 5 1016 
onstruction Foreman су. 059 1525 8 28 33 25 6 1016 
p Plane Woodworker с. 550 195 5 24 42 27 2 1000 
RO NGO annaran илеш 566 14814 5 28 34 26 7 1000 
‘rine Oiler отт 141 266 1 28 41 28 2 998 
ИША! Techniki coron oc cree 5614 2 24 51 21 2 998 
lacks idtm PE EEN 89 0 4 22 9 п 6 99.8 
[gos чун кынан ев езана» CE E. 5 f 
Rigger ^. con Operator sah. ge арашы ah 319 5 27 32 32 4 988 
Ape Fabric & Dope Worker ...... 548 538 2 22 46 29 1 988 
cole Seaman 065 1050 5 28 33 31 3 988 
Gable Splicer 039 156 3 24 41 29 3 988 
тер опеег 398 5 26 33 30 6 984 
9rmati 4012 3 23 41 5 
Painter, Mp ud Operator 1002 2 24 40 30 4 982 
QI? Fighter 1698 3 25 36 31 5 978 
winner, AA 237 3 16 49 29 3 974 
Mitthouseman у, мышы. 566 3 26 33 32 6 974 
punitions Work 8,553 3 23 37 34 3 970 
punter, Automobile 120 2 19 43 30 6 968 
ainter, General 937 3 24 35 33 5 968 
Mave Shovel Operat 17 6 18 38 29 9 964 
аа e m FREES St 
one Swi 55 021 42 i 
Hund Obserichboard Operator ... б 9% 5 м 4 37 0 960 
ara Nh AN „+. — 
т, Operator асот Machine 359 1,306 20 41 31 7 958 
Ene Кво Keres eee 242 478 1 24 35 30 10 956 
Smeman Operator aas a manaa OS] 388 3 25 30 31 11 954 




TABLE 1 (Continued) 


Title SSN* N I II III IV V Median†

Auto Mechanic _.................. 0M 540 2 19 39 M 6 92 
Field Wire Chief ................... 595 7220 1 23 34 36 6 т, 
аар КЫ Ер есы 017 3,348 2 19 37 36 6 oid 
8,47 2 19 37 36 6 $5 

6069 1 18 39 38 4 Oty 

iS 2 3 3 6 MO 

326 1 20 36 34 9 Srg 

257 2 22, 32 33 П Ede 

6454 2 19 35 37 7 94 

3089 2 19 35 37 7 $59 

17 1 23 30 43 3 96 

iz 1 5 4) 40 4 S 

Heavy Auto Equipment Operator ... 931 5,865 2 14 38 38 ? 912 
De Repairman „а.а 204 56 1 14 37 42 2 912 
Tractor Driver’ eere ы ккк 24 1118 1 15 36 36 12 йб 
Searchlight Operator we HÀ 214 2 16 33 4 H 90.6 
Field Lineman ....... S. 641 363 1 21 29 43 90.0 
Ammunition Handler 504 229 2 14 34 40 P 90.0 
Оа 060 36279 1 14 35 42 8 гол 
Jackhammer Operator ... ms 339 111 2 18 30 40 1 89.1 
Demolition Specialist ...... ss 33 388 2 18 29 39 Y: 88.8 
Messenger ............... . 675 178 2 15 32 € 879 
Control Station Operator .. . 544 519 2 15 30 42 1) 873 
Machine Gunner, АА... . 606 641 1 14 31 47 7 870 
Packing Case Maker U . 203 26 3 17 25 48 7 $70 
Guard-Patrolman ч. . 522 41,787 1 11 33 47 0 870 
Ambulance Driver .... .. 699 164 2 13 31 45 2 861 
Hospital Orderly ................. 30 4,748 1 12 31 44 17 gig 
Lineman, Tel. & Tel. лы айка 238 3,865 1 10 32 30 0 86 
Automatic Rifleman у... уу, 746 153 0 26 15 50 jj 846 
CünuüBéeb ы» зше вд лы Do 531 1,023 1 9 32 ^ 15 83.7 
БО Р юу киш C dar n 02 за 10 28 4 D g4 
Laundry Machine Operator .. ». 0d 12 3 15 22 45 18 83.1 
ЗӨ? oen vivre as мыры .. $21 105,140 2 13 25 42 6 gil 
Engine Test Operator ............. 520 195 16 17 Ш А 16 816 
Auto Equipment Operator 345 14287 1 10 26 mA 21 80.1 
Orderly 695 2675 1 9 25 29 73.0 
Rifleman 745 1,534 0 17 12 ri 30 1750 
Тавон 590 1254 1 9 D 4 5 y 
Half-Track Driver... сти dà 0 9 B $3 6 
Airplane Handler .. Mi 2415 1 5 14 € 103.7 


po 774,383 6.3 33.2 33.1 21.8 


Е i 

The AAF, because of its need for men to be trained for
skilled jobs, received in general men who scored above the
Selective Service average in GCT. Due to this fact, it
might be wondered why the AAF is not even higher. One
cause for reducing the apparent AAF-Army difference was the
draining of high GCT men into officer training. Results in
Table 1 are only for enlisted men and do not include officers.
The Army data were for the selective service intake, which
included many men who were later to become officers. The
minimum GCT requirement for officer training was 110.
The cases shown in Table 1 are those of AAF enlisted men
in the continental United States in August 1943. Cases have
been omitted where records were incomplete and where there
were fewer than 100 cases reported for a specialty. Such
omissions were relatively few and the results as shown are for
practically all of the AAF enlisted men who were in the country
at the time stated. Air crew specialists, such as Aerial Gunners,
who are often treated separately, are included, as are ground
crew, such as Airplane and Engine Mechanics. Not only are
Air Corps men included, but also men in services with
the Air Forces, i.e., Engineers, Ordnance, Quartermaster,
Medical, and Finance.

The medians range from 136.7 for Weather Forecaster to
67.8 for Airplane Handler. Ninety-six per cent of enlisted
Weather Forecasters possess GCT scores of 110 or above, which
is the minimum GCT requirement for officer candidate school.

Differences in GCT levels for different specialties were
probably caused mainly by job requirements, and in part by
the standards for entrance to technical school courses.
As probably occurs in civilian life, the supervisor is not
always brighter than the people he supervises. For example,
Weather Station Chiefs score lower (Median 121.4) than both
Weather Forecasters and Weather Observers (Medians 135.7
and 126.2). This is of particular note since the Weather
Station Chief must be rated as a Weather Forecaster (4).

On the other hand a most frequent hierarchy among AAF
enlisted men shows a regular, although small, increase of test
score with increasing responsibility. Several Airplane and
Engine Mechanics maintain a single plane. A Crew Chief and a
Line Chief customarily are trained first as Airplane and Engine
Mechanics and are promoted on the basis of competence to
become Crew Chief, who is the head maintenance man for one
airplane, or to become Line Chief, who is the head maintenance
man for several airplanes. The medians are Airplane and Engine
Mechanic, 110.0; Crew Chief, 112.6; Line Chief, 116.0. A
possible explanation of the apparent difference between the
weather specialist results and the maintenance results is that
within a job family the relationship between level of
responsibility and test score is a straight line up to
approximately 12 points, but does not apply above that score.


TABLE 2
Comparison of Army General Classification Test Results between Military and
Civilian Occupations (White and Colored)
                                   Military          Civilian       Mil. Med.
Title                              N       Med.*     N       Med.   -Civ. Med.
Bandsmen (Musician) 5,717 113.4 162 112.5 .9
AISE roseis cernens 275 1152 51. 051 04 
Tab. Machine Operator .......... 1190 1194 140 1198 08 
Jtr mc MMC ENS .. 2,898 1100 457 108 717 
Weldar. „сеат ы кнага .. 3,940 102.2 500 103.4 “18 
Clerk-Typist РРР, . 6,881 115.4 472 117.2 -18 
Painter, General ........... Ё 937 96.8 474 98.6 d 
Cool BIER аа шннен. 39627 90.0 552 923 539 
Public Relations Man ........... 560 126 4 155 735 
Stenographer n.a 95 1180 18 113 7 
Photographer ................... 500 1154 96 116 51 
ИШЕ гаан 291 111.4 133 116.5 -5.5 
Draftsman .............. uuu 2,530 116.2 153 191.7 260 
Auto Mechanic ................. 5,491 95.2 500 1012 -63 
Electrician 0. 2462 1040 298 1103 765 
Sheet Metal Worker ............ 101.6 500 1081 776 
Meat Cutter .. d "s s 287 96.4 272 1040 -80 
Tractor Driver ...... У 91.2 389 992 205 
Auto Serviceman .. sie 94.0 600 103.4 -98 
Carpenter „ана 93.4 479 103.2 29.8 
Barbar sce ns suisses soam iem gr ш i -0 
Installer Repairman, Tel. & Tel. .. 106.2 6 n 6 -108 
ES Stan RE sip HERR US SUR M sels seins 93.8 131 104 12 
; acis 
= к ЫШ um qi cB 


* Medians calculated by interpolation, from per cent in each grade. се 
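
The interpolation named in the footnote can be sketched as follows. This is only an illustration, not the authors' worksheet. The grade limits used (Grade I, 130 and above; II, 110-129; III, 90-109; IV, 60-89; V, below 60) are the commonly cited Army grade boundaries and are an assumption here, since the article does not state them. The outer limits of 40 and 160 and the function name are hypothetical.

```python
# Assumed AGCT grade intervals (lower limit, upper limit), lowest grade first.
# The article does not give these limits; they are stated here as assumptions.
GRADE_LIMITS = [
    ("V", 40.0, 60.0),
    ("IV", 60.0, 90.0),
    ("III", 90.0, 110.0),
    ("II", 110.0, 130.0),
    ("I", 130.0, 160.0),
]


def interpolated_median(percent_by_grade):
    """Estimate a median score from the per cent of cases in each grade.

    Percentages are accumulated from the lowest grade upward; within the
    grade that contains the halfway point, cases are assumed to be spread
    evenly across the grade interval (linear interpolation).
    """
    total = sum(percent_by_grade.values())
    half = total / 2.0
    cumulative = 0.0
    for grade, lower, upper in GRADE_LIMITS:
        pct = percent_by_grade.get(grade, 0.0)
        if pct > 0 and cumulative + pct >= half:
            return lower + (upper - lower) * (half - cumulative) / pct
        cumulative += pct
    raise ValueError("no grade contains the halfway point")


# Per cents for Airplane & Engine Mechanic as printed in Table 1.
airplane_engine_mechanic = {"I": 6, "II": 44, "III": 40, "IV": 9, "V": 1}
print(round(interpolated_median(airplane_engine_mechanic), 1))
```

Under these assumed limits the Airplane and Engine Mechanic row gives 110.0, the median quoted in the text. Other rows agree only to within a few tenths of a point, which rounding of the published percentages could account for.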


The results shown in Table 1 may be of some interest
to the military services in revising desirable minimum
qualifications for various specialties. The results may also be
of interest in relation to certain civilian jobs. With civilian
applications in view, Table 2 has been prepared, which gives for
a set of occupations a comparison between civilian and military
occupations on GCT. This table shows that in general the
AAF used men of lesser capacity than did civilian business and
industry. The medians are 5-14 points lower for the military as
compared to civilian in the cases of the following occupations:
Draftsman, Auto Mechanic, Electrician, Sheet
Metal Worker, Meat Cutter, Tractor Driver, and Laborer. In
no case did the AAF use men of appreciably higher average
score than civilian industry and business. No doubt the men
with high scores were diverted into combat crew training and
into technical training for jobs which had no civilian
counterparts.


REFERENCES
j B necs Government Printing Office, 615-26, 1942, p. 
2, ; Е 

2. Boring, E. G. (editor). Psychology for the Armed Services.
   Washington: Infantry Journal, 1945, p. 241.
3. Ibid.
4. Staff, Personnel Research Section, Classification and Enlisted
   Replacement Training Branch, The Adjutant General's
   Office. "Personnel Research in the Army. II. The
   Classification System and the Place of Testing." Psychological
   Bulletin, XL (1943), 205-211.


RELATION OF TEST SCORES TO AGE AND 
EDUCATION FOR ADULT WORKERS 


D. WELTY LEFEVER, ALICE VAN BOVEN AND JOSEPH BANARER 
San Bernardino Air Technical Service Command 


THE question is frequently raised, especially by shop foremen
in connection with testing programs in industry: Is it fair
to expect older workers to compete on paper-and-pencil tests
with those considerably younger in years? The effect of age
on mental alertness may well be a handicap in addition to that
represented by the greater interval of years since the older
worker attended school and endeavored to read and answer
written questions. A second and similar problem is concerned
with the relationship of test scores to the amount of schooling
obtained by each worker. The personnel testing program at
the San Bernardino Depot of the Air Technical Service Command
provided an opportunity to assemble pertinent data from
the scores on certain aptitude tests as well as from the results
of administering a series of job information tests developed at
the depot.

The initial analysis to be presented includes a graphic
comparison of the standard scores on a number of job information
tests for a group of men and a group of women mechanics
belonging to several age levels. These data are shown in
Figure I. The job information tests included those for Sheet
Metal, Service Mechanic, Parachute Packing and [...]
Reconditioning, Paint and Dope, Spark Plugs, [...] and
Lathe Operator, all of which were administered to workers
below the supervisorial grade. In this group of tests there
was also introduced a test in Warehousing which was administered
to all warehousemen from the grade of junior up to and
including the supervisors. Standard scores were employed to
make the tests more nearly comparable for number of items and
for difficulty.
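
One common way of putting tests of different length and difficulty on a single scale is the linear standard score. The sketch below assumes a scale with mean 50 and standard deviation 10; the article does not state which scale was used, so these constants, the function name, and the raw scores are all hypothetical.

```python
from statistics import mean, pstdev


def standard_scores(raw_scores, target_mean=50.0, target_sd=10.0):
    """Convert raw scores on one test to linear standard scores.

    Each score is expressed as its distance from the test's own mean in
    units of the test's own standard deviation, then rescaled. The
    50/10 scale is an assumption made for this example.
    """
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    if sd == 0:
        raise ValueError("all raw scores are identical")
    return [target_mean + target_sd * (x - m) / sd for x in raw_scores]


# Hypothetical raw scores on a short, hard test and a long, easy test.
short_test = [12, 15, 18, 21, 24]
long_test = [40, 50, 60, 70, 80]

print([round(s, 1) for s in standard_scores(short_test)])
print([round(s, 1) for s in standard_scores(long_test)])
```

The two hypothetical tests differ in length and in average raw score, yet workers in the same relative position receive the same standard score, which is what allows averages to be taken across tests.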

The similarity of the graphs for the two sexes is noteworthy.
The workers under 20 years of age averaged about a third of a
standard deviation below the mean of the total population.
A like drop characterized the scores of those from 50 to 59 years
of age. For those beyond sixty the average approaches a half
standard deviation below the general norm. These deviations
do not appear to be very great in comparison to the wide range
of ages represented. The mean scores for the intermediate
range, from 20 to 49 years, are relatively constant; they
indicate the period of maximum test efficiency and job
knowledge.

RELATIONSHIP OF AGE TO CERTAIN JOB INFORMATION TEST SCORES
[Chart: average standard scores for men and for women, plotted by
age group, with the number of men and of women in each group.]
Figure I

That the younger and inexperienced worker should possess
less information about his job is to be expected, but the reasons
for the drop in scores among the older workers are more
complex. The problem of "selection" is not entirely clear.
Older workers were engaged in such jobs as spark plug
reconditioning, clothing repair, etc., which do not attract the
more intelligent worker. In the case of the men, the selection
was directly affected by the war since men over military age were
not under stress from their draft boards to seek defense work. The
younger men who remained in civilian status were of high
enough grade to merit occupational deferment. Such a group
could be expected to achieve higher test scores. The same
reasoning can be offered to explain the fact that the peak in
test scores was reached by the men in their thirties but by the
women in their forties. Perhaps it is gratifying to learn that
women in the 40 to 49 age group, many of whom had never
before worked outside their homes, could learn new skills and
readily adjust to an industrial situation and could master fairly
complex information regarding their jobs.


Correlations Between Age and Job Information 
Test Scores 


The correlation coefficients were computed between age and
the scores on the tests included in Figure I. These coefficients
were calculated for the whole age range, for a curtailed age
range with older workers excluded, and for a curtailed range
from which the younger workers were omitted. The results
are presented in Table 1. When the full age range was used,
the median of the correlation coefficients was found to be -.06.
Note that the correlations obtained for groups of men were all
more strongly negative than the median, whereas the groups
composed of women produced coefficients which were positive
or very nearly equal to zero. Only one correlation, that for the
male group of warehousemen, was definitely non-linear. The
distribution of scores in this group was somewhat skewed
because of the scarcity of men of military age in the warehouses.
The younger men in the group (who held draft deferments)
scored high on the test. The average age for the male
warehouseman was 48 years while the mean for the female
warehouseman was 35 years.

When the older workers (over fifty years of age) were
eliminated from the computation, the median correlation for
the series of tests became .08. The groups of women workers
yield the values above the median while the negative
correlations are associated with the male groups. Even with this
curtailed age range, the distribution for male warehousemen
does not produce linear regression. The latter group was the
only one under the fifty-year age limit which indicated the
presence of an age handicap. In computing the data for Table
1 the test for linearity was not applied to the curtailed ranges
when the full range of ages produced linear regression.

TABLE 1
Correlation Between Age and Job Information Test Scores
[Table: correlation coefficients by test and group (Warehousing,
group of 8 tests, Sheet Metal, Service Mechanics; men and women)
for the whole group of workers, workers under 50, and workers
over 21.]
The age range produced by eliminating the workers under
21 years of age yielded a median correlation coefficient of -.03.
Negative coefficients were associated only with groups
containing male workers.

The correlation coefficients thus far reported (except for the
male warehousemen) indicate that age is not a serious handicap
to the adult worker in taking job information tests.
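
The procedure described in this section, computing a product moment correlation for the whole age range and again for each curtailed range, can be sketched as below. The records are invented for illustration and the function names are hypothetical. The cut-offs of 50 and 21 follow the text, but whether the boundary ages themselves were kept is not stated and is an assumption here.

```python
from math import sqrt
from statistics import median


def pearson_r(xs, ys):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 cases")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("one variable has no variation")
    return sxy / sqrt(sxx * syy)


def r_for_age_range(records, min_age=None, max_age=None):
    """Correlate age with score after curtailing the age range."""
    kept = [
        (age, score)
        for age, score in records
        if (min_age is None or age >= min_age)
        and (max_age is None or age <= max_age)
    ]
    return pearson_r([a for a, _ in kept], [s for _, s in kept])


# Hypothetical (age, standard score) records for one group of workers.
records = [
    (19, 46), (24, 52), (28, 50), (33, 55), (37, 51),
    (42, 53), (46, 49), (51, 47), (56, 45), (62, 43),
]

whole = r_for_age_range(records)
older_excluded = r_for_age_range(records, max_age=50)
younger_excluded = r_for_age_range(records, min_age=21)

print(round(whole, 2), round(older_excluded, 2), round(younger_excluded, 2))
# The article reports the median coefficient across groups; here the three
# coefficients for a single group are merely passed through median() to
# show the call.
print(round(median([whole, older_excluded, younger_excluded]), 2))
```

Curtailing the range changes the coefficient because the cases removed sit at the extremes of age, where a drop in score has the most leverage on the fitted line.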


Correlations Between Age and Aptitude Test Scores 


Table 2 presents the Pearson product moment coefficients
for the correlation of age and the scores on a series of aptitude
tests. The median coefficient for the Otis and Wonderlic
intelligence tests was -.15. The Learning Ability tests developed
at Headquarters, Air Materiel Command, show no handicap
for the older worker since the median coefficient was .02. Since
some of the younger workers had graduated from the local high
school where the Otis test had been administered, it is possible
that the Otis scores were unduly high for this age group of
mechanic learners; such a condition would contribute to a high
negative correlation.

The Civil Service Clerical Examination does not appear to
have discriminated against the older applicant. However, when
the group of warehousemen (composed about equally of
both sexes) took the ATSC Clerical Aptitude Test, Form B, the
results indicated that the older workers were somewhat slower
than the younger ones. The coefficient was -.30.

For the series of mechanical aptitude and number checking
tests shown in Table 2 the only evidence of a serious degree of
age handicap appears again in association with the male group
of warehousemen. Here the Pearson correlation was -.41.
In Table 3 an analysis similar to that shown in Table 1 is
reported for certain aptitude tests. Included are five
aptitude tests selected because negative correlations with age
resulted from the use of the total age range. The median


TABLE 2 


Correlations Between Age and Aptitude Test Scores 


Aptitude test groups          Number of cases    Correlation coefficient
Otis 
Female mechanic learners ............ 220 


Male mechanic learners .. 
Female warehousemen ... 
Male warehousemen ... 


Wonderlic E 
Female mechanic learners ............ 220 2218 
Male mechanic learners .............. 75 

2-415 

Median coefficient for above gTOUpS ....... 

Learning Ability, Form 5 E 
БЕГИ ras ai a чыгана na rose p fede 100 Я 

Learning Ability, Form 7 22 

emale sheet metal workers ‘00 

ale sheet metal workers 02 

ixed sheet metal workers 16 
Female clerical applicants 

Median coefficient, 02 
Learning Ability Test groups ........ 

Median coefficient, -.05 
All intelligence test BOPE ЖИН saas 

Civil Service Clerk - .06 
Female clerical applicants ............ 91 

Clerical Aptitude, Form B 25,231. 
Female warehousemen .............. 160 -.30 
Male warehousemen 

Median coefficient, 2,30 
Clerical test groups .................. 

Visualization 2105 
Female warehousemen ............... 160 -.08 
Male warehousemen ................. 150 

Spatial Judgment 01 

= Al 
= 14 


coefficient for this group for the whole age range was -.16. All
regressions were linear with the exception of those for male
mechanic learners and warehousemen on the Otis test and male
warehousemen on the ATSC Clerical Aptitude Test.


TABLE 3
Correlation Between Age and Aptitude Tests for Whole and Curtailed Age Ranges
[Table: number of cases and correlation coefficients by test and
group (Wonderlic, Otis, Clerical Aptitude, Spatial Judgment,
Minnesota Paper Form Board; male and female mechanic learners,
warehousemen, and mechanics) for the whole group of workers,
workers under 50, and workers over 21.]


Eliminating the workers over fifty did not improve the
measures of relationship; the median correlation became -.22.
A greater difference was introduced by omitting the workers
under twenty-one from the scatter diagrams. The median
coefficient for the curtailed range proved to be -.10.

The general conclusions for the Otis and Wonderlic tests
seem to be that the younger workers, many of whom had
attended high school not long before taking the test, made rather
high scores. With this group eliminated from the computation
the correlations indicate little handicap because of age. For the
ATSC Learning Ability tests the younger workers need not be
omitted to demonstrate that age was not a handicap. The
scores on the ATSC Clerical Aptitude Test point to a steady
slowing down with advancing age. The results for the
mechanical aptitude tests are not consistent enough to warrant
a generalization.


The Relationship of Education to Job Information
Test Scores

Figure II presents a graphic comparison of average job
information test scores for groups of men and women workers
classified according to the grade level reached in school. The
curves agree closely about the way in which test achievement
varies with the amount of schooling for men and women.
Relatively small differences in test scores are indicated
except for the small groups with the least schooling.
Here the handicap is much more evident for the men than
for the women.

RELATIONSHIP OF EDUCATION TO CERTAIN JOB INFORMATION TEST SCORES
[Chart: average standard scores for men and for women, plotted by
school grade reached, with the number of men and of women in each
group.]
Figure II

The correlation between age and education for a sampling
of one hundred men was computed to be -.35; that for 180


TABLE 4
Analysis of Correlations for Age, Education, and Job Information Test Scores
T Correla Partial r, E puel Partial 
" М 
est and Group Sex шй tion test FI vs. r,age 
of cases | үс age inu educa- constant 
E tion 
P огтаціоп Test in | 
Const Metal ..... F 90 34 41 27 35 
n posite of 8 Job ` i 4 : i 
ompa nation Tests .. F 180 = -.01 18 AT 
noite of 8 Job 
Job Inpation Tess .. | M 100 | -.26 | -20 21 12 
in Ta ration Test 
Job таге Operating . M 31 -.08 -.01 35 34 
f nformation Test 
us | 
gp chanics |... 2 02 32 32 
rt, Metal Workers Both | 150 06 
Pons Ability, 
Learning. AS F 90 22 30 35 40 
n b 
сүрт Teen us Е 91 16 21 26 28 
Medians е Clerical F 91 E te E 2 2 


women was -.16. Thus, it may be seen that many of the older
workers whose scores fell below the mean of the total population
were also those with less education.

The joint relationship of age and education to test scores
is analyzed in Table 4. When the factor of education is held
constant, the median value for the partial correlation coefficient
for age and test score approaches zero. The slight age handicap
in taking paper-and-pencil tests is apparently in part the effect
of somewhat fewer years of schooling for the older worker. If
all the testees had completed the same number of years of
schooling, the age handicap would probably have been reduced.
Even when education is held constant (by mathematical
formula), age seems to be more of a handicap to the older men
than to the older women.
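
The article does not print the "mathematical formula" used to hold a factor constant. The standard first-order partial correlation is assumed in the sketch below, with hypothetical function and variable names. The three input coefficients are the values for the men's composite of 8 job information tests as they appear to read in Table 4 and in the text.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    denom = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("a correlation with the controlled variable is +/-1")
    return (r_xy - r_xz * r_yz) / denom


# Men, composite of 8 job information tests (N = 100), as read from the
# article: test vs. age, test vs. education, age vs. education.
r_test_age = -0.26
r_test_edu = 0.21
r_age_edu = -0.35

# Age and test score with education held constant.
age_given_edu = partial_r(r_test_age, r_test_edu, r_age_edu)
# Education and test score with age held constant.
edu_given_age = partial_r(r_test_edu, r_test_age, r_age_edu)

print(round(age_given_edu, 2))
print(round(edu_given_age, 2))
```

With these inputs the age coefficient shrinks from -.26 to about -.20 once education is held constant, which agrees with the corresponding entry in Table 4. The education coefficient comes out near .13 against a tabled .12, a gap that rounding of the three published inputs could plausibly explain.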

The partial correlation coefficients produced when age was
held constant differ but slightly from the total coefficients
between test scores and education. The relationship between
years of schooling and job information test scores is low but
positive, indicating that education does assist the testee to a
certain degree in making better scores on a paper-and-pencil
test.

The average education of the workers who took the job
information tests reported in this paper was found to be a little
above the ninth grade. A safe conclusion would appear to be
that the worker group as a whole possessed sufficient
educational background to preclude any serious handicap in this
respect on paper-and-pencil tests. Those with less than
sixth-grade schooling should perhaps be accorded a certain amount
of special consideration.


Summary

Workers under twenty years of age or over fifty (especially
men over fifty) do not average as high on job information tests
as workers between twenty and fifty, but the age handicap is
slight. When the factor of education was held constant by
mathematical formula, age and job information test scores
produced a median correlation of zero. The ATSC Learning
Ability Test appears to involve no handicap for older workers. On
the other hand, the clerical test findings reveal the presence
of an increasing handicap with advancing age.

Schooling does not appear to be a critical factor in
determining job information test scores except for those with
less than sixth-grade education.


TEST SELECTION: A PROCESS OF COUNSELING 


EDWARD S. BORDIN AND RAY H. BIXLER
University of Minnesota 
The field of counseling and psychotherapy is marked by
divergent viewpoints. For the most part these viewpoints
fit within the framework of two major settings, namely
the making of educational and vocational decisions or the
working out of problems involving highly personalized feelings
and attitudes. Counselors working within the former setting
are likely to be most concerned with tests, the technology of
prediction, and job analyses. Counselors in the latter setting
are more likely to have their attention focused upon the need
for more effective methods of handling attitudes and feelings
in the interview.

Each of these preoccupations has in its turn led to or been
associated with significant contributions toward the increased
effectiveness of counseling. Much has been achieved in the
development and evaluation of tests and other devices.
Predictive batteries have been established, and the limits of many
tests have been defined. Although the accuracy of prediction
is still limited and necessitates rough and ready clinical
judgments, increasingly greater proportions of human behavior are
being measured systematically. Similarly, considerable progress
has been made and is still being made in the development
and detailed description of interview processes which are
appropriate and effective for handling attitudes and feelings.
Counseling is entering a period in which it will become
increasingly important for integrations to be made in the
interview involving the use of the technology of prediction
and the handling of attitudes and feelings. The provision of
far-reaching educational and counseling services for veterans
will mean that individuals will be seeking counseling while in
the process of reconstructing their attitudes and feelings as well
as their educational and vocational goals. Public attention
has been and will continue to be focused upon the adequacy
of these services to veterans. It is important that counselors
deal with both aspects of their problems if the psychological
profession is to avoid detrimental repercussions.

The purpose of this paper is to describe and to illustrate
interview procedures designed to provide for the selection of
tests when the counselor has the dual problem of aiding the
client to make an educational and/or vocational choice and to
reorient his attitudes and feelings. The goal is not merely to
insulate each objective so that it does not impede the other,
but rather to suggest interview procedures whereby progress
toward one goal means progress toward the other.

The underlying orientation in the procedure to be discussed
is that clients can deal most effectively with their own feelings
and attitudes when they are active participants in the interview
process, when they are permitted to attack their problems in
their own terms, and when they are permitted to choose their
own directions in grappling with their problems.


The Setting for Counseling

Clients coming to the Student Counseling Bureau of the
University of Minnesota are likely to couch their initial
statements of their problems in terms of vocational and/or
educational choice. A large proportion of them are graduating
high-school seniors who have been referred by their school
counselors, teachers, or other high-school personnel workers. In
most cases the referral has been in terms of an opportunity to
obtain vocational or educational advice. The degree to which
the Bureau is accepted throughout the state is indicated by the
large numbers of clients who seek its services merely as a
matter of playing safe in their educational and vocational
decisions. The total effect of this is to orient clients toward
taking tests and receiving aid with problems of educational and
vocational choice.


Interview Procedure

Under these conditions clients tend to project responsibility
upon the referral agent or the counselor. The most frequent
response to the counselor's introductory question about
why they came to the Counseling Bureau is, "I thought I would
come in and take the tests to find out what is best for me to
do," or often they take even less responsibility, saying, "Miss
____, my counselor at ____ High School, thought
I ought to take the tests." Students enrolled in the University
are likely to make similar statements, e.g., "I was having a little
trouble with my English, and my advisor thought I ought to
come in and see what you could tell me." Counselor responses
which enable the client to clarify his concept of how the
problem is best solved and to determine his role in the solution
often lead directly to the expression of the attitude that he
thinks tests will help him.
At this point it is not unusual for counselors to assume
complete responsibility for selecting a set of tests which, from the
information they have obtained, appears to be appropriate.
Many times, in order to select appropriate tests, it is necessary
for the counselor to ask a series of probing questions, thus
reinforcing in a subtle yet effective way the idea that he
is carrying the responsibility for action and decision. This
procedure appears to have merit, since the counselor is skilled in
prediction and test selection. Yet, to yield to this temptation
to exercise his skill will be to run the danger of depriving the
client of the possibilities of self-expression which may lead to a
revision of his view of his problem. It will probably make him
more dependent on the counselor, not only by taking a
prescriptive role but also by limiting the client's readiness to
make use of test information for the development of better
self-understanding and the initiation and execution of action.
To state it another way, by placing too much emphasis
on efficient and comprehensive collection of test data as a
means of solving human problems, the counselor assumes the
risk of not achieving this end of counseling. As an alternative,
we suggest that the process of selecting tests be a cooperative
one shared by the client and the counselor.

In order to make it possible for the client to share in the
process, alterations have to be made in traditional procedures.
The counselor must describe in non-technical language the
judgments the client can obtain about himself from various
tests, permitting the client to assume the responsibility for
deciding which judgments will be helpful to him in working out
his problem.1 This does not mean that the client is completely
“on his own” in trying to decide. As he tries to puzzle things
out, perhaps struggling with anxieties about the possible adverse
results of taking a test, the counselor helps him to clarify
his feelings and to overcome the obstacles to accepting help.
The counselor assumes the responsibility for selecting in each 
area the test which is the most accurate for obtaining the judgment
desired. For example, the counselor decides whether or
not the Ohio State Psychological Examination or the Miller
Analogies Test is the most appropriate test after the client himself
decides whether or not he wants a measure of college aptitude.
Some counselors may doubt whether or not tests can be
made sufficiently understandable to the client for him to decide
which ones he wants. Our experience gives little ground for
this concern, since the majority of clients seem to request tests
which are suited to the prediction they desire.
Following a preliminary discussion of the client’s problem
as he sees it (providing, of course, the student feels that tests
are instrumental to the solution of his problem), the authors
find that they have received more satisfactory results when they
described tests in terms of the functions involved, followed by
examples of jobs in which the functions are important.
The following statements are representative of those being
used at present by the senior author in describing
tests. Following the presentation of each test, the client is free
to discuss his reactions to it and to decide whether or not he
wishes to take the test. The statements are undergoing continual
revision and should not be considered as more than indications
of what may be said. They should be adjusted to the
available battery of tests and the significance of those tests
demonstrated for the particular setting in which they are being
used.
1 We feel that the method described in this article does not apply when a
client has been referred by another agency for purposes of diagnosis, e.g., an
applicant referred by the Board of Admissions for testing and recommendation
for purposes of admission to the University.




One type of test we have is one that gets at your general 
learning ability. You can get a comparison of your common- 
sense learning ability and your book-learning ability with that 
of the general run of people (Wechsler Adult and Adolescent 
Scales). If you wish, you can get a comparison of your book- 
learning ability with that of college students (American Coun- 
cil or Ohio State Psychological Examinations). We find that 
this last kind of measure when taken along with rank in high- 
school graduating class is the most accurate basis for predicting 
what a student will do in most types of college curricula. 

Another type of test that we have is one which compares 
how much you know in specific subjects with how much others 
know. For the most part these tests do not predict anything 
about you; but certain ones, under special conditions, do. For 
example, one test compares your knowledge of high-school 
mathematics with that of entering freshmen in engineering 
who have had about the same amount of high-school mathematics
as you have had (Cooperative Mathematics Test).
Scores on this test, when taken along with your rank in your
high-school graduating class, provide the most accurate basis
for predicting how well you will do in engineering. Similarly,
a test of your knowledge of the application of scientific principles
(Johnson Science Test) and your knowledge of algebra
(Cooperative Algebra Test), when compared with entering
freshmen in these fields and taken along with your high-school
rank, give the most accurate basis for predicting grades in
agriculture, forestry, and home economics. The remaining tests of
knowledge are merely ways of checking your impression of how
much you know in a particular subject or what subjects you
know best.

Also, we have tests that get at more restricted types of
skills. For the most part these are skills that are the basis
for predicting how well a person would learn jobs that do not
require college training. Some of these skills would be good
to have in college-trained jobs, but they are not vital. For
example, one test of this type gets at the ability to work
quickly and accurately in routine checking operations (Minnesota
Clerical Aptitude Test), the sort that are required in
paper work in an office. This is a skill that is vital for an office
clerk or a bookkeeper. It would also be good for an accountant
to have but would not be so vital. Another test gets at the
ability to see objects in a different position from the one shown
(Revised Paper Form Board Test). It is the type of ability
that enters into such work as blueprint reading and drafting.
Still another gets at a person’s knowledge and understanding
of common-sense mechanical principles, his
“know how” (Bennett Mechanical Comprehension Test).
This test provides a basis for predicting how well a person
would learn a wide range of mechanical jobs. Another type of
test gets at a person’s ability to manipulate objects with his 
hands—the fine kinds of manipulations that are required of a 
watchmaker, an engraver, or a dentist (Finger Tool Dexterity 
Test) or the larger kinds of manipulation that are required of
a carpenter or an auto mechanic (Spatial Relations or Manual
Dexterity tests).

The tests we have talked about so far are ways of getting
predictions as to how well you would learn something or of
getting at how much you already know. We also have tests
that get at how you feel about things. People make up their 
minds as much or more by how they feel as by what they 
know or could learn. The main way we help people to take 
their feelings into account is by giving them the opportunity 
to talk things over with us in this kind of interview. With the 
kind of help we can give them, we find they can get a deeper 
understanding of how they feel. Many times these tests can
help a person along in this process of puzzling things out by
giving them new slants on how they feel about themselves.

In one test you would indicate how you feel about yourself
in terms of occupational or occupationally-related activities
(Strong Vocational Interest Blank). From this you might get
a new slant on how you see yourself in terms of occupations.
For example, the way you feel now, you may not like the idea
of yourself as a salesman--you just can’t see yourself as the
salesman type--but you do see yourself as the scientific type
of guy.2 This test gets at this feeling by comparing your likes
and dislikes with those of successful men in various types of
occupations. Another test gets at how you feel about yourself
more generally, not just in terms of occupational activities.
What you can get out of this test is a personality description
of yourself (Personality Test). Still another test is a collection
of the kinds of questions people usually ask themselves in
making up their minds (case history blank). There is no score
on this test. The only help you could get from it would be in
the process of answering the questions, if it led you to think
about something that you had not considered before. Incidentally,
it is also a convenient way of getting better acquainted
with you.


It has been our experience that the majority of clients enjoy
this opportunity to select their own tests and to select tests
appropriate to their individual needs. Some clients, however,
resent this approach or feel that they are not able to select their
own tests. If the counselor accepts the client’s right to feel
resentful or fearful, the client will usually continue selecting
tests. Occasionally he will balk, as in the following:


2 For the basis for this interpretation of the Strong Blank see E. S. Bordin,
“A Theory of Vocational Interests as Dynamic Phenomena,” EDUCATIONAL AND
PSYCHOLOGICAL MEASUREMENT, III (1943), 49-66.


S. “Am I selecting the right tests?”
C. “You are wondering if you’re taking the tests you ought
   to.”
S. “Yes, you know a lot more about this. You pick out
   whatever you think I ought to take.”
C. “You think I should select them because I know so much
   more about tests than you do, and you are afraid you will
   pick the wrong ones.”
S. “Yes, you pick them out. I don’t care. I’ll take whatever
   you want me to.”
C. “You want me to choose the tests for you pretty badly.
   I shall be happy to tell you what kind of answers we can
   get from the different tests, but you are the one who has
   to select the tests you want to take.”
S. “Well, O.K.”
C. “You don’t like the idea very well, but are willing to go
   ahead anyway.”
S. Laughs and acknowledges this.


At no time has the client ever refused to continue at this
point. The counselor could easily acquiesce in these instances
where the client rebels at the freedom of selecting his own tests.
However, it seems undesirable to foster a dependent relationship.
Case files are full of records describing dependent relationships
which are finally broken off in disgust by the counselor
or by the client’s eventual refusal to continue having someone
else plan and regulate his life. It has been our experience that
these same clients are the ones who attempt to lean heavily
on the counselor’s judgment as to what college they should
attend, what courses they should take during their school year,
and what extra-curricular activities they should join. It
seems important for the counselor to accept their desire to have
that type of service, but to recognize its limitations and thus
not fall into this pattern of client-counselor relationship.
It seems probable that the client will be able to make
greater use of test results when he, himself, has requested such
data. We have observed less rationalization of test
results. The client tends to accept more readily the significance
of his test scores when he has taken the responsibility for selecting
the tests and understands what information about himself
they can give him. We find that, under these circumstances, the
student makes considerable use of self-observation while
taking tests. After taking the Cooperative General Mathematics
Test, a student who has been thinking about engineering
says, “I am not so sure about engineering now. I saw how
much less math. I knew than I thought.”

A wealth of diagnostic data is usually obtained as the battery
of tests is described to the client. Each and every test
is a possible stimulant to the client’s discussion of that field or
phase of his life. Descriptions of achievement tests usually
bring forth a pretty clear picture of the client’s likes and dislikes
and his concept of his ability in these areas, while the opening
statement about the importance of high-school rank and the
college aptitude test frequently brings forth data concerning
the client’s attitude toward his college prognosis, his grades in
high school, and his reaction to high school.

If the counselor is oriented to the client’s attitudes, he will
frequently find his reflection of these attitudes leading to pertinent
discussions of the client’s problem and, in some cases, to a
reorientation of his concept of his problem. In the following
example the counselor has just described the Cooperative Science
Achievement Test.

S. “I would like to take that, but I am already a sophomore.”
C. “You feel it is too late to be thinking of science.”
S. “Yes; I like science a lot, though. I got good grades in
   high school in physics and chemistry. I’m taking botany
   now and I like it very much, but I don’t remember it
   well.”
C. “You do very well in it, but it doesn’t seem to stick.”
S. “Yes, I decided not to go into it when I came here. Dad’s
   a chemist, and my sister is a medical technician. (Her
   face takes on a determined look.) I always said I could
   hold my own with them--but I decided it was better for
   me not to go into science.” (Here, the counselor gained
   a wealth of diagnostic information, the student
   taking steps toward clearer insight into her own problems
   and motivations.)

It seems well to let the client exhaust any topic he brings
up in connection with any of the tests. It is in this fashion that
he gradually comes to grips with his problem. The counselor
who wants diagnostic data will find that he obtains more spontaneous
and, therefore, more dynamic and meaningful facts about the
client.



The importance of permitting discussion of factors brought
up by the client is illustrated by the following excerpt. A personality
test has just been offered to a veteran enrolled in the
University.

S. “I’ve always been an exceedingly rational person.”
C. “You don’t see much value in taking such a test since you
   are so rational in nature.”
S. “No. I’m rational. I’m not--I don’t have feelings.”
C. “You are steady--don’t get upset.”
S. “Yes, I’ve always been that way until recently.” (concern)
C. “You’re disturbed because you find yourself changing.”
S. “Yes. I never had trouble--I did my job--335 days of
   action--that’s a long time. Fellows broke every day.
   Nothing happened to me until two days after the war was
   over. Then my face began to twitch.”
C. “It’s awfully confusing to find that you’ve changed so
   much, especially when you bore up so well in action.”

The veteran who was “so rational” was freed to re-live
experiences which had resulted in deep guilt and bitterness.
He described a physical attack he made on one of his men
who broke down during an amphibious combat maneuver and
his subsequent shame at a base hospital when he realized how
sick such men really were. He told of ordering a man to
remain in his fox hole and of seeing the man killed as he
obeyed the command. He talked of the heroism of a fatally
wounded buddy and the silver stars awarded “to colonels for
flying over a body of water.” Near the end of the discussion
he said, “I’ve never told this to anyone--they don’t understand;
and I’ve felt that I should keep it to myself; but I
believe that’s part of my trouble.”


This contact could have been closed without the client’s
coming to the point of expressing these pent-up feelings. It
should be pointed out that he brought his conflict into the open
because the counselor let him select his own tests and accepted
the attitudes that were stimulated to expression by the test
descriptions. In his second interview he reported that he had been
able to concentrate on his studies for the first time.
The evaluation of this methodology must await the execution
of research studies of the type outlined in the next section.
However, we can describe experiences with two clients
that we feel are typical of what can be obtained. 

The first example is that of a twenty-one-year-old girl who 
came to the Bureau after working in an industrial plant for two
years after graduation from high school. She initiated the
interview by stating that she had decided to go to college and
was thinking quite seriously about medical technology. She
said she wanted to take tests to find out whether that field
would be a good one for her. She amplified this as the discussion
continued to include the feeling that there was little security
in her present job with so many servicemen about to return
and, furthermore, that she was dissatisfied with the job in any
case. After she had more or less exhausted this discussion she
stated clearly that she felt tests would help her. At this point
she and the counselor began to discuss the various tests. After
the description of tests of general learning ability, she said, “I
think I would be better in the common sense learning situation
(Non-verbal intelligence tests general population norms)
I’m in now than in the book-learning situation (verbal intelligence
test). I have trouble concentrating on anything.”
After that she decided that she would wait until after all the
tests were described before she picked out any. At the conclusion
of the counselor’s discussion of the tests, she said she
thought she had better take the tests related to her feelings, since
her problem was really how she felt about things. She indicated that
later the other tests might be helpful. By this time the period
allotted for the interview had come to a close. However, she
seemed to have developed so much impetus toward working on
her problem that it appeared difficult for her to stop. She mentioned
that she thought a course in psychology might help her
to gain a better understanding of herself. Then, as she was
getting up to leave, she remarked that she was teaching a Sunday
school class and found it difficult to be patient with the
students.

The second example is that of a twenty-three-year-old serviceman
who stopped in while on furlough. He said he
was looking ahead to the time when he would be discharged from the
service and felt that he should work out his vocational plans
prior to that time. He talked of his vocational plans in terms
of college training. His orientation toward taking tests was so
explicit that the discussion turned in that direction almost immediately.
During the discussion of the tests of general learning
ability, he was given a prediction of his probable achievement
in college based on his rank in his high-school graduating
class and his percentile score on the American Council Examination,
taken four years ago at the time of graduation from high
school. The prediction was that the odds would be against his
being a satisfactory college student. The discussion passed to
the aptitude tests without his having chosen any test of general
ability. During the discussion of aptitude tests related to
mechanical performance, he mentioned that after graduation
from high school he had attended a fine arts school and had
been very interested in landscape gardening. In the service he
was a crew chief in the Army Air Corps and liked this mechanical
job and felt competent in it. He expressed the feeling that
he was being tugged in two directions by his civilian and service
experience. He decided to take a comprehensive series of mechanical
aptitude tests. In discussing the achievement tests,
further expression by the client was touched off by the mention
of the Cooperative General Mathematics Test as a good basis for
prediction of achievement in the Institute of Technology. He
spent considerable time talking about his reaction to high
at the, Which was that he felt he had эше e крй 
tı i titude to 
W and ата E е au. e compensating for his 
Previous е е усту Seated that he thought his mathe- 
aties рош He also indicated that he th FERE 
avera ackground would prove to be i und ds i: ha 
student. After some reluctance he decided to take
a college aptitude test and asked for tests which would compare
his learning ability with that of the average run of people.


Hypothesized Results of Procedure 


1. It has been illustrated in the excerpts and cases presented
above that a situation develops in which the client brings forth
materials relating to his feelings and his history in a way which
will enable him to understand their significance more readily.
The second case, particularly, illustrates how this procedure
obviated the necessity for probing in order for the appropriate
tests to be decided on. If the counselor desires a wealth of
diagnostic data, he will seldom be disappointed. Furthermore,
and this is of considerable significance, these data are given
spontaneously. It is quite possible that tests can be more helpful
toward the closing stages of a counseling process. Their
place in the client’s thinking will probably determine this. The
first case is an instance in which the client, as a result of her
insights, determines the stage in the counseling process in which
tests should enter.

2. The procedure facilitates the development of a deeper
understanding of the problem. The first case is a particularly
clear example of an instance in which the process of discussion
of the test judgments resulted in a radical restatement of the
problem. This procedure is a source of more efficient counseling
where students are drawn into the Bureau with a mistaken
idea of the amount of information that may be obtained from
tests. In our experience there are many clients who, after discussion
of the judgments available from their high-school rank
and college aptitude tests and the additional judgments that
could be obtained from taking more tests, came to the realization
that they were seeking a degree of certainty and specificness
in judgment that was not obtainable and that this was
their only reason for coming to the Bureau.

3. This procedure fosters an active role for the client and
an early recognition of his responsibilities in the counseling
process.

4. As the client takes tests, he is aware of the significance
for him of his performance on each test. This prepares him to
make use of this opportunity to observe himself. After taking
tests selected by this method, many times clients come back
to the counselor with their attitudes considerably changed as a
result of this observation of themselves. Further, clients will
be more motivated to submit to an extensive testing program
when they have participated in the process of choosing it.


Needed Research


Discussion and description of new counseling methodology
would not be complete without its research implications.
Manifestly, when a specific method grows out of clinical experi- 
ence, it can be said to have been validated by the observations 
of its authors. However, it must be recognized that such vali- 
dations can be only considered private demonstrations of the 
validity of the method and that the requirements of science 
call for public demonstration. This means that studies must be 
made which will demonstrate the validity of the method not
only to the satisfaction of the authors but to the satisfaction 
of others. 

At this stage it is not possible to report any single study of 
the effectiveness of client participation in test selection, but a 
number of types of studies can be suggested. These are listed
below:

1. One highly significant study would compare the degree
of acceptance by clients of their test results under conditions
of client participation and under traditional conditions. Do
clients accept adverse test results more readily when counseled
by this method than by any other?

2. Under which conditions, the new method or the traditional
one, do clients take more active responsibility for
working toward the solution of their problems? One would
expect this characteristic of activity to be evidenced by the
amount of spontaneously volunteered information, the number
of new directions of self-exploration initiated by the client.

3. One question which may trouble some counselors is that
of the frequency with which the suggested method will result
in failure to collect important test information for the complete
case study. It should be possible to compare methods in terms
of the number and appropriateness of tests chosen.

4. Which method facilitates a more positive attitude toward
taking tests on the part of clients? Data may be obtained on the
degree of resistance of clients to take responsibility for
choosing tests. Further, it would be necessary to find out under
which conditions they exhibit more definite interest and cooperation
in the test-taking process and their attitude toward it.
This will necessitate electrical recording of interviews, since the
process of the interview is essential to this type of evaluation.



DATA REGARDING THE RELIABILITY AND 
VALIDITY OF THE ACADEMIC 
INTEREST INVENTORY 


WILBUR S. GREGORY 
University of Nebraska 


The Academic Interest Inventory was developed by the
author during the period from 1938 to 1941. The present form
of the test was used experimentally in September, 1941, when
it was administered to the freshmen who matriculated at the
University of Nebraska. The data presented in this paper are
based on that experimental administration of the test in 1941.
Work on the Inventory was suspended shortly after December
7, 1941, and the Inventory has not been published for general
use. It probably will be published in 1946.

The Inventory consists of twenty-eight scales at present.
It was designed to measure interest in specific “areas” that
make up the curricula of colleges and universities. Its use will
probably be limited to college students. It was developed for
use by college and university counsellors in conjunction with
aptitude and achievement tests, in order to:
1. Aid students in the selection of the college curriculum in
which they will specialize, i.e., in choosing between the Engineering
College, the College of Business Administration, the
Teachers College, the College of Arts and Science, the Agriculture
College, etc.

2. Aid students within a college in selecting their “major”
and “minor” areas of specialization, i.e., in choosing between
Chemistry, Geology, Home Economics, History, Sociology, etc.,
as “majors” or “minors.”

3. Aid students in the selection of specific courses and
electives.

4. Aid counsellors in evaluating failures and “problem”
cases. Many students fail courses, in spite of the fact that they
possess the abilities and prerequisite training necessary for the 
work, because they lack the necessary interest to apply them- 
selves to the courses in which they are enrolled. 

The twenty-eight scales which make up the Inventory were
developed by the statistical procedures used by Dr. E. K.
Strong in developing the weights for the items in the revision
of his Vocational Interest Blank. A minimum of 100 seniors
and juniors who were specializing in a specific department, such
as Mechanical Engineering, Architecture, Chemistry, etc., was
used as the “criterion group” for developing weights for the
items in the scale. These criterion groups were secured through
the cooperation of the heads of departments in a number of
universities, including Purdue University, Syracuse University,
Northwestern University, Oklahoma University, and the University
of Nebraska. The development of most of the scales
would not have been possible without the cooperation of the
heads of the departments who administered the experimental
form of the test to their senior and junior class majors. A
subsequent paper will list these men by name in order to acknowledge
the author’s gratitude to them. That paper will
outline in detail the procedures used in developing the scales
for the Inventory and the scoring weights for the items. The
author also wishes to acknowledge the extensive contribution
of Mr. H. M. Cox, Director of the Bureau of Instructional
Research, the University of Nebraska, in supervising the scoring
of the tests and the machine and statistical work involved in
tabulating the scores and computing the means, sigmas, and
r’s used in the tables in this article.

The twenty-eight scales which are included in the Inventory
are listed in Table 1. The Inventory consists of a total
of 300 items. For each item the examinee marks the appropriate
space on the answer sheet to designate one of the following
attitudes toward the item: very interested, mildly interested, indifferent,
mildly disinterested, or very disinterested. The weights
for scoring the items range from -4 to +4. The items in the
Inventory consist of topics studied or operations performed in
various classes, such as:



1. Study the history of architecture.
2. Determine or test the “hardness” of water.
3. Play deck tennis.
4. Study principles of design of women’s clothes.
5. Repair farm machinery.
6. Translate Latin texts.
7. Dissect the brain of a sheep.
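
The scoring arrangement described above, in which each of the five possible responses to an item carries a weight between -4 and +4, can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions: the function name `scale_raw_score`, the item numbers, and every weight shown are hypothetical, since the actual weights are not reported in this article, and the simple summing of weights is assumed by analogy with Strong's procedure rather than stated by the author.

```python
# Illustrative sketch only: item numbers and weights below are invented.
RESPONSES = (
    "very interested",
    "mildly interested",
    "indifferent",
    "mildly disinterested",
    "very disinterested",
)


def scale_raw_score(answers, weights):
    """Return the raw score on one scale as the sum of response weights.

    answers: dict mapping item number -> chosen response label
    weights: dict mapping item number -> {response label: weight}
    Items that carry no weight on this scale contribute zero.
    """
    total = 0
    for item, response in answers.items():
        if response not in RESPONSES:
            raise ValueError(f"unknown response for item {item}: {response!r}")
        weight = weights.get(item, {}).get(response, 0)
        if not -4 <= weight <= 4:
            raise ValueError(f"weight out of range for item {item}: {weight}")
        total += weight
    return total


# Hypothetical weights for two items on a single scale.
example_weights = {
    1: {"very interested": 4, "mildly interested": 2, "indifferent": 0,
        "mildly disinterested": -2, "very disinterested": -4},
    2: {"very interested": -3, "mildly interested": -1, "indifferent": 0,
        "mildly disinterested": 1, "very disinterested": 3},
}
example_answers = {
    1: "very interested",
    2: "mildly disinterested",
    3: "indifferent",  # item 3 has no weight on this scale
}
example_score = scale_raw_score(example_answers, example_weights)
```

Because each of the twenty-eight scales would have its own table of weights, the same answer sheet could be passed through the function once per scale.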


The data presented in the following paragraphs are the 
result of preliminary analysis of the scores made by the men 
and women in the freshman class at the University of Nebraska 
in September, 1941. More detailed and thorough studies will
be published in the future. 

For the correlations and comparisons reported in this paper, 
Scaled scores were used rather than raw scores. The scaled 
Scores provide a nine-point scale in which each of the nine 
Points represents approximately one-half sigma. Each of the 
nine points of the scale represents the following percentage of 
the total distribution: 

Scaled Score        Per Cent of Distribution
9 (Highest)                   3%
8                             7%
7                            12%
6                            18%
5 (Middle)                   20%
4                            18%
3                            12%
2                             7%
1 (Lowest)                    3%
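
A nine-point scale whose points are each about one-half sigma wide can be sketched as follows. This is a minimal illustration, not the author's conversion procedure, which the article does not give: the function names `scaled_score`, `normal_cdf`, and `band_percent` are hypothetical, the middle point is assumed to be centred on the mean, and the raw-score distribution is assumed to be normal.

```python
import math


def scaled_score(raw, mean, sigma):
    """Map a raw score to a nine-point scaled score in half-sigma steps."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (raw - mean) / sigma
    # Each band is 0.5 sigma wide; band 0 is centred on the mean.
    band = math.floor(z / 0.5 + 0.5)
    # Scores beyond the outermost bands are folded into 1 or 9.
    return max(1, min(9, 5 + band))


def normal_cdf(z):
    """Cumulative distribution function of the standard normal curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def band_percent(score):
    """Per cent of a normal distribution that receives a given scaled score."""
    if not 1 <= score <= 9:
        raise ValueError("scaled score must lie between 1 and 9")
    lower = -math.inf if score == 1 else (score - 5) * 0.5 - 0.25
    upper = math.inf if score == 9 else (score - 5) * 0.5 + 0.25
    return 100.0 * (normal_cdf(upper) - normal_cdf(lower))
```

Under these assumptions the middle band holds roughly one-fifth of the cases, which agrees with the 20 per cent listed for the middle scaled score; the agreement at the extremes is only approximate, as the word "approximately" in the text suggests.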


Test-Retest Reliability 

In order to determine the reliability of the twenty-eight
scales which comprise the Inventory, the r was computed for
each scale between initial test and retest scores. Ninety-nine
freshmen, selected at random from students in the class which
entered the University of Nebraska in September, 1941, were
used as subjects. The initial test was administered to the
entire class. The ninety-nine students whose scores were used
to compute the test-retest reliability were given the retest two
to three months after the initial testing.

The mean and sigma of the distributions of scores for both
the first testing and retesting and the coefficient of correlation
between the initial and retest scores are presented in Table 1.
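
The test-retest coefficient described above is the ordinary product-moment correlation between each student's first and second scaled score on a scale. A minimal Python sketch follows; the function name `pearson_r` and the five pairs of scores are hypothetical and are not data from the study.

```python
import math


def pearson_r(first, retest):
    """Product-moment correlation between paired first-test and retest scores."""
    if len(first) != len(retest) or len(first) < 2:
        raise ValueError("need two equally long lists of at least two scores")
    n = len(first)
    mean_x = sum(first) / n
    mean_y = sum(retest) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, retest))
    sxx = sum((x - mean_x) ** 2 for x in first)
    syy = sum((y - mean_y) ** 2 for y in retest)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Invented scaled scores for five students on one scale.
first_scores = [4, 5, 6, 3, 7]
retest_scores = [4, 6, 6, 3, 6]
example_r = pearson_r(first_scores, retest_scores)
```

In the study itself the computation would be repeated once for each of the twenty-eight scales, using the ninety-nine pairs of scores.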


TABLE 1
Results of Retesting on Gregory's Academic Interest Inventory
N = 99

                            First Test            Retest
Scale                       Mean          σ     | Mean          σ
Agriculture                 4.08          2.21  | 3.87          2.04
Architecture                4.98          1.98  | 4.61          1.89
Biological Science          5.02          1.68  | 5.06          1.86
Business Administration     5.22          2.00  | 4.84          1.99
                            5.23          2.20  | 5.65          2.16
                            3.61 (3.21)*  2.37  | 3.54 (2.65)   2.47
                            3.29 (2.45)   2.41  | 3.31 (2.78)   2.44
                            4.81          2.29  | 4.82          1.98
                            4.15          2.33  | 3.96          2.18
                            4.07          2.34  | 3.82          2.22
                            4.14          2.30  | 4.02          2.23
                            5.87          1.90  | 4.47          1.88
                            5.08          2.10  | 4.97          2.12
                            4.64          1.89  | 4.57          1.76
                            5.60          1.91  | 5.23          1.99
                            5.46          2.00  | 5.03          1.99
Home Economics              3.86 (3.45)   2.60  | 3.67 (3.13)   2.46
Journalism                  5.16          2.15  | 4.98          2.12
                                          2.18  | 4.82          2.14
                                          1.98  | 5.23          1.95
                                          2.11  | 4.27          1.99
                                          1.96  | 4.72          1.90
                                          1.83  | 4.91          1.89
                                          2.11  | 5.21          2.04
                                          2.23  | 5.06          1.94
                                          1.70  | 5.02          1.98
                                          2.27  | 4.80          2.12
                                          2.00  | 4.76          2.05

* Figures in parentheses denote medians where it was thought that the
mean was unreliable.


It will be noted that the means of the distributions for both
the initial testing and retesting tend to be scaled scores of 4, 5,
or 6. In other words, the students used as subjects appear to
be “typical” of the freshman class rather than limited to one
interest group.
It can be noted that fourteen (one-half of the scales) Уі 


: age? 
test-retest 7's of + .90 or higher. These scales аге: Lang" 936; 


r = * 949; English, 7 = +.942; Mechanical Engineering; 7 ^ туе 
Speech and Dramatics, r---.933; Physics, 7=+ 254i b 918; 
Economics, r = + .925; Agriculture, r = + .920; History; 7 A engi 
Elementary Education, т = + 916; Music, r = + .916; Cr 9013 
neering, r=+.912; Geology, 7=+.911; Journalism, i 
Sociology, 7 = + .900. 




Thirteen of the remaining fourteen scales yielded test-retest
r's between +.897 and +.816. These r's are sufficiently high to
justify use of all twenty-seven of the scales.

The one scale whose reliability can be questioned is that for
interest in Business Administration. The test-retest r for this
scale was +.691. Although this r is not low enough to justify
discarding the scale for Business Administration, the counselor
who uses this test should keep in mind the fact that it is
definitely less reliable than the other twenty-seven scales which
are included in the inventory.
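
The test-retest r reported for each scale is the Pearson product-moment
correlation between the first-test and retest scores of the same
students. A minimal sketch of that computation follows. The function
name and the six pairs of scaled scores are hypothetical illustrations
and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scaled scores for six students on one scale (not study data).
first_test = [2, 4, 5, 5, 7, 8]
retest = [3, 4, 5, 6, 6, 8]
r = pearson_r(first_test, retest)
print(f"test-retest r = {r:+.3f}")
```

In the study the same computation would be repeated once per scale,
giving twenty-eight coefficients from the ninety-nine retested students.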


Evidence of Validity Found by Comparing Scores of 
Students Enrolled in Different Colleges 


Evidence of the validity of the scores was found by compar-
ing the mean scores on each scale of the students enrolled in
certain curricula at the University of Nebraska. It is assumed
that matriculation in a particular college in the University
(College of Arts and Sciences, College of Business Administra-
tion, Agriculture College, Engineering College, and Teachers
College) could be used as a "group" criterion of validity. This
criterion has obvious limitations and weaknesses: some stu-
dents enroll in a college even though they are very uncertain
of their educational and occupational goals; some students
matriculate in a college with misconceptions regarding the cur-
riculum of the college (for example, students have enrolled in
Engineering school who have strong aversions to Mathematics
and Physics); but the greatest weakness is to be found in the
fact that the interests of students in a college are by no means
homogeneous (for example, in the College of Arts and Sciences,
some students are primarily interested in Fine Arts, others in
Journalism, others in the Sciences, others in Social Studies, etc.,
with definite aversions to other courses which are included in
the curriculum of that college). Such weaknesses in this cri-
terion of validity would tend to lower evidence of validity.
Consequently, what evidence of validity for the scales can be
discovered by using this criterion may be regarded as significant.

In Table 2 are presented the means of the scores of fresh-
man men and women in each of five of the Colleges of the
University of Nebraska in the class that matriculated in Sep-
tember, 1941. Mean scaled scores of 1, 2, or 3 would be signifi-
cantly below the mean scaled score of 5, and mean scaled scores
of 7, 8, and 9 would be significantly higher than this theoreti-
cal mean.


TABLE 2
Mean Scaled Scores of College Groups on Gregory's Academic Interest Inventory

                                    Men                    |         Women
                      Ag.   A&S.  B.A.  Eng.  T.C.  |  Ag.   A&S.  B.A.  T.C.
Agriculture ........  6.7   4.25  4.6   5.45  4.2   |  2.85  2.15  2.4   ?
Architecture .......  5.25  4.2   5.2   5.65  4.55  |  4.6   4.05  ?     ?
Biol. Science ......  ?     6.05  4.35  5.2   5.05  |  ?     5.7   4.1   ?
Business Admin. ....  5.4   4.05  7.0   4.3   5.75  |  ?     ?     6.1   ?
Chemistry ..........  6.05  ?     5.5   6.1   5.05  |  ?     4.55  4.25  ?
Com. Arts ..........  2.15  2.3   3.05  1.05  3.1   |  5.2   4.0   5.3   ?
Elem. Education ....  1.55  ?     2.0   ?     3.1   |  4.95  4.1   4.5   ?
Secondary Educ. ....  3.75  4.3   4.05  2.9   2.6   |  6.35  5.3   5.7   ?
Civil Engineering ..  5.15  4.05  4.85  6.0   4.2   |  2.8   2.3   2.8   ?
Elec. Engineering ..  5.35  4.3   4.4   6.45  3.95  |  2.4   2.3   2.0   ?
Mech. Engineering ..  5.3   4.2   4.3   6.25  3.9   |  2.5   2.3   ?     ?
Pub. Serv. Engin. ..  4.95  4.35  5.15  ?     4.4   |  3.6   3.4   4.2   ?
English ............  ?     4.6   5.5   3.3   4.8   |  5.4   5.9   ?     ?
Fine Arts ..........  3.8   4.05  4.2   3.45  4.15  |  5.75  5.3   ?     ?
Geology ............  6.35  5.8   5.55  6.8   5.1   |  4.4   4.4   ?     ?
History ............  4.6   4.8   5.2   3.2   5.3   |  5.2   5.3   ?     ?
Home Econ. .........  1.8   2.3   1.95  1.0   2.7   |  5.8   4.6   4.8   ?
Journalism .........  3.85  4.1   5.05  3.2   4.7   |  5.4   ?     ?     ?
Languages ..........  3.7   4.6   4.1   2.7   4.9   |  5.95  6.0   ?     ?
Mathematics ........  ?     5.45  5.4   6.05  5.5   |  4.35  4.1   ?     ?
Military Science ...  5.5   3.7   5.05  5.6   4.3   |  2.7   2.4   2.9   ?
Music ..............  4.25  4.4   ?     3.1   ?     |  6.05  5.6   ?     ?
Phys. Education ....  5.45  5.15  5.4   3.8   6.05  |  5.4   4.75  4.15  ?
Physics ............  6.1   5.7   5.5   7.35  5.5   |  ?     4.25  ?     ?
Psychology .........  ?     4.9   4.25  3.25  ?     |  5.8   6.05  ?     ?
Religion ...........  5.65  5.6   5.1   5.05  5.3   |  4.95  4.75  ?     ?
Sociology ..........  3.6   4.35  ?     2.85  4.75  |  6.05  6.0   ?     ?
Speech and Debate ..  3.7   4.2   4.35  2.6   4.9   |  5.9   5.8   ?     ?

[? = entry illegible in this copy.]

An examination of the data presented in Table 2 suggests
the following evidence of validity for the various scales:

1. Agriculture Scale. The highest mean scaled score (6.7)
was that of the men in the College of Agriculture. Although
all four groups of women scored means of 2.85 or below, the
women in the College of Agriculture had a higher mean
on this scale than those in the other colleges.

2. Architecture Scale. The highest mean (5.65) was that
of the men in the Engineering College.


3. Biological Science Scale. The highest mean (6.05) was
that of the men in the College of Arts and Sciences. The
women's group that had the highest mean score (5.7) was the
Arts and Sciences group also.

4. Business Administration Scale. The highest mean score
(7.0) was that of the men in the Business Administration Col-
lege. The women in the Business Administration College, with
a mean score of 6.1, were higher on this scale than the other
women's groups.

5. Chemistry Scale. The highest means were those of the 
men in the College of Engineering (6.1) and the men in the 
College of Agriculture (6.05). 

6. Commercial Arts Scale. The highest mean scores were 
those of the women in the College of Business Administration 
(5.3) and the women in the Teachers College (5.3)—women 
who may be preparing for business positions or for the teaching 
of Typing, Shorthand, and Bookkeeping. 

7. Elementary Education Scale. The highest scores were
made by the women in the Teachers College (5.8). This is the
only mean score above 5.

8. Secondary Education Scale. The highest mean scores
were those of the women in the Teachers College (6.55) and
the College of Agriculture (6.35). The latter is significant
because a large percentage of women in the College of Agricul-
ture prepare themselves to teach Home Economics. It should
be noted, however, that the men in the Teachers College had the
lowest mean score (2.6) of any of the men's or women's groups
on this scale. They are a small group made up primarily of
athletes (the highest mean score of the men in the Teachers
College was in Physical Education), so they may be primarily
interested in participation in sports rather than in teaching.

9. Civil Engineering Scale. The highest mean score was 
that of the men in the College of Engineering (6.0). 

10. Electrical Engineering Scale. The highest mean score
was that of the men in the College of Engineering (6.45).

11. Mechanical Engineering Scale. The highest mean score
was that of the men in the College of Engineering (6.25).

12. Public Service Engineering Scale. The highest mean
score on this scale was made by the men in the College of
Business Administration (5.15). In view of the fact that the
engineering scales correlate negatively with the Social Sciences
(see Table 3) and that Public Service Engineering training
includes courses in the Social Sciences and in Business, this is
not invalidating.

13. English Scale. The highest mean score on this scale
was that of the women in the College of Business Administra-
tion (6.2), with the women in the Teachers College and the
College of Arts and Sciences close behind (6.0 and 5.9). The
validating significance of these means is not clear.

14. Fine Arts Scale. None of the college groups can be
used as a criterion group for interest in the fine arts. The
women in the Agriculture College scored the highest mean
(5.75) and their course does involve interest in dress design
and other aspects of art, but no special significance can be
attached to their mean or to that of any other of these groups.

15. Geology Scale. The highest mean score was that of the
men in the College of Engineering (6.8) and the men in the
Agriculture College (6.35). These two means probably reflect
the weight of a general scientific interest factor although Geol-
ogy should be of interest to agriculturists and to chemical,
petroleum, civil, and other engineers.

16. History Scale. The highest mean scores were those of
the women in the Business Administration College ([illegible]),
the women in the Teachers College (5.5), the women in the College
of Arts and Sciences (5.3), the men in the Teachers College
(5.3). History is included in the curriculum of all colleges
except the Engineering College, and the mean score for men
in the Engineering College was the lowest (3.2).

17. Home Economics Scale. The highest mean score was
that of the women in the Agriculture College (5.8). The
women in the Teachers College also had a mean score above 5
(5.2). All of the men's groups scored means below three, which
would be expected in view of the sex differences in interest in
Home Economics.

18. Journalism Scale. The highest mean score was that of
the women in the College of Business Administration (6.3).
This may reflect the fact that many of the women in that col-
lege at Nebraska University are interested in advertising as a
career.

19. Languages Scale. The women in the Teachers College 
have the highest mean score (6.2) on this scale, reflecting the 
tendency for a large percentage of language students to prepare 
for teaching. It is to be noted that the women in the College 
of Arts and Sciences rank second and that the men in the 
Teachers College and College of Arts and Sciences scored the 
highest means of the men’s groups. 

20. Mathematics Scale. The highest mean score was that 
of the men in the Engineering College (6.05). 

21. Military Science Scale. None of these college groups 
can be used as a criterion group for this scale. It is to be noted 
that all four of the women’s groups scored means below 3. 

22. Music Scale. The highest mean score was that of the
women in the Teachers College (6.25) as should be expected
because most of the music majors at Nebraska prepare for
teaching. The women in the Agriculture College also scored a
mean above 6 (6.05).

23. Physical Education Scale. The highest mean score was
that of the men in the Teachers College (6.05). This is to be
expected in view of the high percentage of athletes who enter
Teachers College to major in Physical Education.

24. Physics Scale. The highest mean score was that of the
men in the Engineering College (7.35). The men in the Agri-
culture College scored a mean of 6.1.

25. Psychology Scale. The highest mean was that of the
women in the College of Arts and Sciences (6.05), with the
women in the Teachers College a close second (6.0). The men
in the Teachers College and Arts and Sciences College ranked
higher than the men in the other colleges. These means may
be evidence of validity since most psychology students are in
those two colleges.

26. Religion Scale. None of these college groups serve as
a validating group for this scale. The men and women in the
Agriculture College scored higher means than their respective
college groups, which may indicate a more conservative ten-
dency in the Agriculture students.


TABLE 3
Table of r's for Men (N = 793) and Women (N = 462) for Each Scale on
Gregory's Academic Interest Inventory with Each of the Other Scales
Comprising the Inventory. (The r's for the women's scores are in the
upper right triangular half of the table and those for the men's scores
in the lower left triangular half of the table.)

[The entries of this 28 x 28 table of intercorrelations are not legible
in this copy.]


27. Sociology Scale. Although the women in the Business
Administration, Teachers and Agriculture Colleges scored the
highest means (7.0, 6.45, and 6.05), the validating significance
of these means is not clear. The means of the women's groups
were higher than any of the means of the men's groups.

28. Speech and Dramatics Scale. The women in the
Teachers College scored the highest mean (6.45), which may
reflect the tendency for most speech majors to prepare for
teaching. The means of all of the women's groups were higher
than the means of any of the men's groups.

In summary, the data presented in Table 2 contain evidence
for validity of the following scales: Agriculture, Architecture,
Biological Sciences, Business Administration, Chemistry, Com-
mercial Arts, Elementary Education, Secondary Education,
Civil Engineering, Electrical Engineering, Mechanical Engi-
neering, Geology, Home Economics, Languages, Mathematics,
Physical Education, Physics, Psychology.


Intercorrelations Between Scores on the Various Scales 
in the Academic Interest Inventory 


Using the scores of 793 men and 462 women who matricu-
lated in the freshman class at the University of Nebraska in
September, 1941, the Pearson coefficient of correlation was com-
puted for each scale with each of the other twenty-seven scales.
Because it was thought that sex differences or a masculinity-
femininity factor might affect the intercorrelations, the r's were
computed separately for the men's and for the women's scores.
The two sets of r's are presented in Table 3.

Inspection of Table 3 also reveals that the r's for the men
on any one of the scales tend to be in the same direction (posi-
tive or negative) and size as the r's for the women on that scale.
It appears that physical sex difference may not be a major
factor affecting the pattern of intercorrelations. However, a
masculinity-femininity factor may be involved, and further
study of this problem will be conducted.

It may be noted from Table 3 that: the highest positive r
for the men was +.87 with only 36 r's above .70 (out of a total
of 378 r's); the highest negative r for men was [value illegible],
with no other negative r's above -.70; the highest positive r for
the women was +.80, with only 17 r's above +.70 (out of a total
of 378 r's); and the highest negative r for the women was -.72,
with no other negative r's above -.70. In view of the high test-
retest reliability of these scales, this lack of high r's found in the
table of intercorrelations indicates that these scales are measur-
ing independent variables to an extent sufficient to justify the
use of each scale. That is, no two scales correlated so highly
that it can be said that they are measuring precisely the same
variable.
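
The total of 378 r's follows from pairing each of the twenty-eight
scales with every other scale once. The sketch below counts those
pairs and shows one way to count the r's above a cut-off in a square
table of intercorrelations. The function names and the three-scale toy
table are hypothetical and are not taken from Table 3.

```python
from itertools import combinations


def count_pairs(n_scales):
    """Number of distinct scale pairs, i.e. r's in one triangular half of the table."""
    return n_scales * (n_scales - 1) // 2


def count_above(r_table, threshold):
    """Count r's above `threshold` among the distinct pairs of a symmetric table."""
    n = len(r_table)
    return sum(1 for i, j in combinations(range(n), 2) if r_table[i][j] > threshold)


# Toy symmetric table for three hypothetical scales (not data from Table 3).
toy = [
    [1.0, 0.8, -0.3],
    [0.8, 1.0, 0.1],
    [-0.3, 0.1, 1.0],
]

print(count_pairs(28))         # 378 pairs for twenty-eight scales
print(count_above(toy, 0.70))  # 1 pair above .70 in the toy table
```

On these counts, the 36 r's above .70 reported for the men are a little
under one-tenth of the 378 pairs, and the 17 reported for the women are
under one-twentieth.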

However, pending a factor analysis of the Inventory, there
are tendencies for certain types of scales to yield intercorrela-
tions which contribute to the evidence of validity of the scales:

1. The scales for various scientific courses (Engineering,
Agriculture, Mathematics, Chemistry, Physics, Biology) tend
to yield significantly high positive r's. There are several ten-
dencies which point toward validity of the specific scientific
scales. For example, the r between the Architecture and Civil
Engineering Scales was +.64, but the Mechanical and Electrical
Engineering Scales yielded r's below +.45 with the Architecture
Scale; the Chemistry Scale yielded much higher r's with the
Engineering, Physics, Mathematics and Geology Scales than
with the Biological Science, Architecture, and other scientific
interest scales.

2. The Architecture Scale yielded r's between +.60 and +.76
with Public Service Engineering, Civil Engineering, and Mathe-
matics Scales and yielded much lower r's with the Physics,
Mechanical Engineering, and Electrical Engineering Scales.

3. The Biological Science Scale yielded its highest r's with
the Chemistry and Geology Scales and did not correlate as
highly with the Engineering Scales as did the Physics and
Mathematics Scales.

4. The Business Administration Scale yielded its highest r's
with the Journalism, Commercial Arts, and Speech Scales,
which represent courses more closely related to business inter-
ests than the other scales.

5. The highest r for the women on the Elementary Educa-
tion and Secondary Education Scales was between those two
scales (r = +.72). This was not true of the r for the men,
however.

6. There is a strong tendency for scales which measure
interest in studies that are heavily weighted in Thurstone's
Verbal Ability Test to yield significantly high intercorrelations.
These Scales include English, Journalism, Languages, History,
Sociology, Speech, Psychology.

7. The Home Economics Scale yielded its highest r with the
Secondary Education Scale (a large percentage of Home Eco-
nomics majors become teachers).

8. The Journalism Scale yielded its highest r's with the
English, Speech, Business Administration, Fine Arts, Sociology,
and History Scales.

9. The Psychology Scale yielded its highest r's with the
Education and Sociology Scales.


Sex Differences on the Twenty-eight Scales


Most of the standardized interest tests, such as Strong's
Vocational Interest Blank, have yielded distinct differences
between the scores of men and women. The present Academic
Interest Inventory also yields distinct sex differences in mean
scores on the various scales. These sex differences constitute
some evidence of validity for certain ones of the scales. In
addition, the counsellor who uses the scales should keep sex
differences in mind.

In Table 4 are presented the mean scores of men (N = 793)
and women (N = 462) on the various scales as well as the dif-
ference between these means. The scales are presented in
Table 4 in the rank order of the size of the difference, with the
greatest difference scored by women over the men at the top
of the list, and the greatest difference of men's mean scores
above the women's at the bottom of the list.
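
The ordering used in Table 4 can be sketched as follows: the difference
is taken as the women's mean minus the men's mean, and the scales are
listed from the largest positive difference to the largest negative
one. The five rows below are those whose means are legible in this copy
of Table 4. The variable names are illustrative only.

```python
# (scale, men's mean, women's mean) for rows legible in this copy of Table 4.
rows = [
    ("Chemistry", 6.11, 3.99),
    ("Home Economics", 2.28, 5.02),
    ("Electrical Engineering", 5.00, 2.19),
    ("Elementary Education", 1.99, 4.94),
    ("Commercial Arts", 2.46, 4.94),
]

# Difference = women's mean minus men's mean, rounded to two decimals.
with_diff = [(name, men, women, round(women - men, 2)) for name, men, women in rows]

# Rank from the greatest excess of women over men down to the greatest excess of men.
ranked = sorted(with_diff, key=lambda row: row[3], reverse=True)

for name, men, women, diff in ranked:
    print(f"{name:<24}{men:6.2f}{women:7.2f}{diff:+8.2f}")
```

A positive difference marks a scale on which the women's mean is the
higher, and a negative difference one on which the men's mean is the
higher.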

The sex differences aid in validating the scales in that:

1. Mean scores of the women are significantly higher than
those of the men for courses in which women are exclusively or
primarily enrolled, namely, Elementary Education Scale, Home
Economics Scale, Commercial Arts Scale, Languages Scale,
Speech Scale, Sociology Scale, English Scale, and the Secondary
Education Scale.


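TABLE 4
Differences Between Mean Scores of Men and Women on the Twenty-eight
Scales in Gregory's Academic Interest Inventory
(N = 462 Women, 793 Men)

                                        Mean Scores
                                       Men     Women    Difference
Education, Elementary ...............  1.99    4.94     +2.95
Home Economics ......................  2.28    5.02     +2.74
Commercial Arts .....................  2.46    4.94     +2.48
? ...................................  3.74    6.16     +2.42
? ...................................  3.72    6.14     +2.42
? ...................................  3.85    6.10     +2.25
? ...................................  3.92    6.01     +2.09
? ...................................  3.94    5.98     +2.04
? ...................................  4.04    5.85     +1.81
? ...................................  4.12    5.84     +1.72
? ...................................  3.77    5.35     +1.58
? ...................................  4.18    5.75     +1.57
? ...................................  4.37    5.43     +1.06
? ...................................  4.94    5.02     + .08
? ...................................  5.02    4.94     - .08
? ...................................  5.17    4.84     - .33
? ...................................  5.19    4.72     - .47
Architecture ........................  ?       ?        ?
Engineering, Public Service .........  4.97    3.62     -1.35
? ...................................  ?       ?        -1.67
Chemistry and Chemical ..............  6.11    3.99     -2.12
Geology .............................  6.08    3.93     -2.15
Physics .............................  6.10    ?        ?
Military Science ....................  4.94    ?        ?
? ...................................  ?       2.32     -2.48
Agriculture .........................  ?       ?        ?
Engineering, Mechanical .............  ?       ?        ?
Engineering, Electrical .............  5.00    2.19     -2.81

[? = scale name or entry illegible in this copy.]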


2. Mean scores of the men are significantly higher than
those of the women on the scales for those courses in which men
are exclusively or primarily enrolled, namely, Electrical Engi-
neering Scale, Mechanical Engineering Scale, Agriculture Scale,
Civil Engineering Scale, Military Science Scale, Physics Scale,
Geology Scale, Chemistry Scale.

3. Those scales for which the difference in the mean scores
was less than 2.0 scaled score units are primarily for courses in
which both men and women usually enroll.

4. It should be noted that the negative r's in Table 3 tend
to occur between the scales at the extremes of the list in Table
4, indicating that the masculinity-femininity factor strongly
affects the intercorrelations of the scales for both men and
women. For example, the men's scores on the Elementary
Education Scale and the Electrical Engineering Scale yielded
an r of -.53 and the women's scores on these two scales yielded
an r of -.41.

Summary 


The present article presents some preliminary statistical
data regarding scores obtained on the author's Academic Inter-
est Inventory. The inventory consists of twenty-eight scales,
each measuring interest in an academic department or "cur-
ricular area." All of the scales yielded significantly high test-
retest coefficients of correlation with the exception of the scale
for interest in Business Administration. Preliminary evidence
of validity for the various scales has been presented in addition
to a table of intercorrelations between the various scales and
data on sex differences.


A SCALE FOR MEASURING PSYCHOLOGICAL
CHANGES DURING MILITARY
SERVICE¹


H. M. HILDRETH 
Lieutenant Commander, H(S) USNR 


The scale described in this article is the outgrowth of a
study of sailors and marines returning from combat areas in the
Pacific. The adjustment difficulties of such men create a prob-
lem both in clinical diagnosis and in the administrative handling
of disciplinary infractions. Some of their reactions are tempo-
rary, some are not. In evaluating the significance of their be-
havior and appraising their psychological state it is important
to know how they have changed, or feel they have changed,
as a result of military experience. The scale described here is
an attempt at objective measurement of these changes.

There is at the present time a notable lack of psychometric
devices for the measurement of human change. Personality
inventories and similar instruments attempt to measure only
the stable and well-established personality characteristics, and
are essentially static in nature. About the only possibility of
measuring change has been the comparison of current perform-
ance on a personality test with previous performance. Even
this method has never been feasible from a practical standpoint
since previous test results are seldom available in clinical work.
The scale described here is a step in the direction of filling this
gap, and is essentially an experiment in the measurement of
psychological change.

Of the many possible approaches to the problem the one
chosen here is the simplest. The individual is asked directly
how he has changed since entering military service. The cate-
gories provided for him are empirically derived and [word
illegible] the spontaneous descriptions of themselves given by men
in the course of clinical interviews.

¹ The opinions or assertions contained in this article are the private ones of the
writer and are not to be construed as official or reflecting the views of the Navy
Department or the Naval Service at large.

Many of the thirty questions in the scale, derived from these
self-descriptions, will be recognized by the student of
psychology as verbal-social stereotypes, and as such, of [word
illegible] value for the purpose of measurement. In this connection
it should be noted that stereotypes, in addition to their advantage
of familiarity, have an altered significance in time of war. The
man in service does not feel it necessary to shoulder the
responsibility for the unfavorable characteristics he admits re-
garding himself. This fact largely cancels the social weighting
of stereotypes and removes one of the chief objections to them.
At the same time, the residual reluctance to characterize oneself
unfavorably serves as a social grid, and willingness to cross
the barrier and acknowledge non-approved characteristics indi-
cates a positive conviction.

Given below is a description of the scale, the method of
scoring, and a few experimental results which show the rather
remarkable way in which the scale appears to differentiate
clinical groups.


PSYCHOLOGICAL-CHANGE SCALE

Instructions: These are questions about how you have changed
since you have been in the service. Check each one.

(Each question is followed by three spaces to be checked, headed
MORE, LESS, and NO CHANGE.)

1. Do you feel that you have become more ambitious, or less
ambitious?

2. Are you inclined to be more moody, or less moody?

3. Have you felt more thwarted or held down than before, or
less so?

4. Since coming into the service are you inclined to be more
cheerful, or less cheerful?

5. Have your experiences made you more hardboiled in your
attitude toward others, or less hardboiled?

6. Has your period of service given you more of a feeling of
inferiority, or do you feel less inferior?

7. Do you tend to get angry more easily than you did before, or
less easily?

8. Do you feel more regretful and sorry about things that have
happened to you, or do you feel less sorry?

9. Are you more self-confident since coming into the service,
or less self-confident?

10. Are you inclined to be more disgusted with things in gen-
eral, or less so?

11. Do you tend to be more optimistic in your viewpoints, or
less optimistic?

12. Do you feel that your life in the service has made you more
dissatisfied, or less dissatisfied?

13. Are you more happy, or less happy?

14. Are you more restless, or less restless?

15. Have you become more sociable, or less sociable?

16. Do you feel more able to take responsibility, or less able?

17. Do you feel more independent, or less independent?

18. Do you feel depressed more often, or less often?

19. Do you feel more tolerant of other people, or less tolerant?

20. Are you more critical of others, or less critical?

21. Do you tend to be more easily annoyed by people, or less
easily annoyed?

22. Do you worry more often, or less often?

23. Do you resent being told what to do more than you did be-
fore, or do you resent it less?

24. Can you concentrate and keep your mind on what you're do-
ing more easily, or less easily?

25. Do you feel more cooperative toward others, or less coopera-
tive?

26. Do you criticize yourself more often than you used to, or less
often?

27. Do you have more patience, or less patience?

28. Do you feel tense and keyed up more often than you used
to, or less often?

29. Do you have more perseverance, or are you unable to
keep at things you're doing?

30. Do you have more and wider interests now than you used to,
or are your interests less wide?


Scoring 


All items on which a patient has indicated change, those
marked MORE or LESS, are tallied according to the following
key: MORE: 1, 4, 9, 11, 13, 15, 16, 17, 19, 24, 25, 27, 29, 30.
LESS: 2, 3, 5, 6, 7, 8, 10, 12, 14, 18, 20, 21, 22, 23, 26, 28.

The number of items on which an individual's answers cor-
respond to the answers on the key is designated his "f" tally.
These are items on which he has indicated favorable change.
The "u" tally, showing unfavorable change, is obtained by sub-
tracting "f" from the total number of items marked either
MORE or LESS. If, for example, 10 of a man's answers corre-
spond to the key, "f" would be 10; and if he had marked 15
items as indicating change of some sort, either MORE or LESS,
his "u" tally would be 15 - 10, or 5.
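
The tallying rule can be sketched in a few lines. The key below is
transcribed from the scoring paragraph, with items 29 and 30 placed
under MORE on the assumption that the printed key, which is cut off in
this copy, ends with those two items. The function name and the answer
coding ("more", "less", "none") are illustrative choices, and the
unfavorable tally is called u here.

```python
# Scoring key: the answer that counts as favorable change for each item.
# Items 29 and 30 under MORE are an assumption (the printed key is cut off).
MORE_KEY = {1, 4, 9, 11, 13, 15, 16, 17, 19, 24, 25, 27, 29, 30}
LESS_KEY = {2, 3, 5, 6, 7, 8, 10, 12, 14, 18, 20, 21, 22, 23, 26, 28}


def tally(answers):
    """Return (f, u) from a dict mapping item number to 'more', 'less', or 'none'."""
    changed = [item for item, answer in answers.items() if answer in ("more", "less")]
    f = sum(
        1
        for item in changed
        if (answers[item] == "more" and item in MORE_KEY)
        or (answers[item] == "less" and item in LESS_KEY)
    )
    u = len(changed) - f  # changed items that do not agree with the key
    return f, u


# Build the worked example from the text: 15 items marked as changed,
# 10 of them agreeing with the key.
answers = {}
for item in range(1, 31):
    favorable = "more" if item in MORE_KEY else "less"
    unfavorable = "less" if item in MORE_KEY else "more"
    if item <= 10:
        answers[item] = favorable
    elif item <= 15:
        answers[item] = unfavorable
    else:
        answers[item] = "none"

f, u = tally(answers)
print(f, u)  # 10 5
```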

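From these tallies three scores are computed: a Combined
score (C), a Degree-of-Change score (Dg), and a Direction-of-
Change score (Dr). With f the tally of favorable changes and u
the tally of unfavorable changes:

$$C = (100)\,\frac{f - u}{30}$$

$$Dg = (100)\,\frac{f + u}{30}$$

$$Dr = (100)\,\frac{f - u}{f + u}$$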



The C score represents quantitatively the degree as well as
the direction of change. The other two scores break down the
C score into its component parts, the relationship of the three
scores being C = Dg·Dr when Dg and Dr are taken as proportions
(with the factor of 100 carried in each score, C = Dg·Dr/100).
The C and Dr scales run from -100 to +100; Dg runs from zero
to +100.
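
As a worked check on these relationships, the sketch below computes the
three scores from the tallies. It assumes the formulas read
C = 100(f - u)/30, Dg = 100(f + u)/30 and Dr = 100(f - u)/(f + u),
which is a reading of the partly illegible printed formulas that agrees
with the stated ranges. The function name is hypothetical, and
returning None for Dr when no change is marked is a choice made here,
not a rule given in the article.

```python
def change_scores(f, u, n_items=30):
    """Return (C, Dg, Dr) from the favorable tally f and the unfavorable tally u."""
    c = 100 * (f - u) / n_items    # combined score, -100 to +100
    dg = 100 * (f + u) / n_items   # degree of change, 0 to +100
    # Direction of change is undefined when no item was marked as changed.
    dr = 100 * (f - u) / (f + u) if (f + u) > 0 else None
    return c, dg, dr


# Example from the scoring paragraph: f = 10 and u = 5.
c, dg, dr = change_scores(10, 5)
print(round(c, 2), round(dg, 2), round(dr, 2))
```

For f = 10 and u = 5 this gives a C of about 16.7, a Dg of 50 and a Dr
of about 33.3, and 50 x 33.3 / 100 returns the C score.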

Except in the case of psychotics and mental defectives, few
questions are ordinarily left blank. When this occurs, however,
adjusted C and Dg scores may be computed by using for a
denominator the total number of questions answered instead
of 30. Such scores are not strictly comparable to unadjusted
scores but for clinical purposes they are useful. They are best
not computed at all when the denominator is less than eight.
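
A minimal sketch of the adjustment, again in Python and again only illustrative: it assumes the adjusted scores use the same numerators as the unadjusted ones (f - u for C, f + u for Dg, with u the unfavorable tally) and simply swap the denominator of 30 for the number of questions answered. The name `adjusted_scores` and the parameter `min_answered` are hypothetical.

```python
def adjusted_scores(f, u, answered, min_answered=8):
    """Return adjusted (C, Dg), or None when too few questions were answered.

    f, u: favorable and unfavorable tallies among the answered questions
    answered: number of questions the subject actually answered
    min_answered: smallest denominator for which scores are computed
    """
    if answered < min_answered:
        return None  # the article advises against computing scores here
    if f < 0 or u < 0 or f + u > answered:
        raise ValueError("tallies must be non-negative and sum to at most 'answered'")
    c = 100.0 * (f - u) / answered
    dg = 100.0 * (f + u) / answered
    return c, dg


print(adjusted_scores(10, 5, 25))  # (20.0, 60.0)
print(adjusted_scores(3, 2, 7))    # None
```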

Occasionally items are omitted because the individual is not
sure of the meaning of the key word even when it is explained
to him. Little difficulty has been encountered in this respect
to date, since most of the patients examined entered the Navy
or Marine Corps when educational standards were high.
Conservative practice, however, would exclude the use of the scale
with subjects of borderline intelligence.


Standardization 


Validity. The Psychological-Change Scale is designed to
measure the changes an individual feels have taken place in
himself during his military service. On this subject there is
no authority but the man himself; and inasmuch as the scale
questions the individual directly there can be little doubt as to
its validity.

It is well to note, however, that there are two types of
interpretation, neither of them warranted, which may easily be
made in using the scale if the limits of its validity are not kept
in mind. In the first place the scale cannot be said to measure
how a person has changed, but only how he feels he has changed.
A man's own conception of what has happened to him does not
necessarily coincide with the opinions of others. Full clinical
appraisal of an individual requires consideration of both points
of view, but the scale confines itself to measuring the changes
as they appear to the man himself.


A second distinction which should be kept in mind in using
the scale concerns the causal interpretation of results. Changes
taking place in an individual during military service cannot be
interpreted as necessarily due to military service. It seems
likely that most changes actually can be attributed to service,
directly or indirectly, because military life involves such an
extensive control of the individual's environment. At the same
time independent factors, chiefly maturation, may account for
some of the observed changes, and consequently a causal
interpretation of results cannot automatically be made.
Conclusions regarding cause and effect can come properly only from
further studies, and for such research the scale can be used as
an instrument of investigation.

Reliability. Repeated administration of the scale at varying
intervals shows that the consistency with which a patient
answers the questions depends on two factors: his length of
service, and his mental condition. In a survey of 250
psychiatric patients, the reliability coefficient was found to vary
greatly with the time interval. Those with a psychopathic
reaction show little change in a month's time. On the other
hand, those who are acutely disturbed or in a state of
mental flux show noticeable differences after a period of a few
weeks. This is particularly true of the Combat Fatigues. For
these patients the scale is still reliable, for repetition of the
scale in from two to six days shows great consistency (r = .95).
Changes shown over a period of weeks or months appear to
reflect actual changes which have taken place in the patient's
mental state.

Clinical Findings 
Preliminary results from the Scale indicate that various
groups are affected in quite different ways by military service.
The data following are based on the responses of 349
patients, including a Control group of 95 non-psychiatric, non-
disciplinary patients. All of the men were sailors or marines
undergoing treatment at a naval hospital. Most of them had
been in the service two years or longer, and a majority had been
overseas. No significant differences existed between the groups
in regard to age, length of service, or overseas duty, except in
the case of the epileptics who were younger and had less time
in the service.


In Tables 1 and 2 are given the means, sigmas and critical
ratios, for the three Scale scores for various clinical groups. The
groups, in the order of listing, are: Disciplinary (with psychiatric
cases excluded), Epileptic, Control, Constitutional Psychopathic
State, Psychoneurotic, Fatigue.


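TABLE 1


Means and Standard Deviations for the Three Scale Scores,
for Various Clinical Groups


                          No. of         Means                Sigmas
Group                     Cases      C     Dg     Dr       C     Dg     Dr
Disciplinary                72      +6     40    [?]      34     33     60
Epileptic                   26     [?]    [?]    [?]      29     20     24
Control                     95     -16     53  -3[?]     [?]    [?]    [?]
Constitutional Psycho-
  pathic State (CPS)        61     -29     61    -48      27     27     43
Psychoneurotic (PN)         47     -50     72    -70      22     22     28
Fatigue (F)                 48     -51   9[?]    -55     [?]     15   3[?]

[?] marks an entry or digit that is illegible in the source.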

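TABLE 2


Critical Ratios for Differences in Mean Scores, for Various Clinical Groups
(Dif/SE dif)

D = Disciplinary, E = Epileptic, C = Control, CPS = Constitutional
Psychopathic State, PN = Psychoneurotic, F = Fatigue. Columns and
entries not shown are illegible in the source.


Combined Score

              F
D          9.70
E          6.38
C          6.09
CPS        3.89
PN         0.18

Degree-of-Change Score

            CPS      PN       F
D          4.03    6.34   12.00
E          0.38    2.66    7.60
C          1.69    4.20   10.50
CPS                2.33    7.86
PN                         5.39

Direction-of-Change Score

Columns C, CPS, PN, F. Legible entries only, in order of
appearance: 8.35, 7.07, [illegible], 2.48.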




Taking as a criterion a critical ratio of 2.58, it can be seen
that the various clinical groups differ significantly from each
other in all but a few cases. When two groups fail to show
differences in one score they invariably show differences in
another score. Epileptics for example do not differ
significantly from the Control group in total impact or degree of
change, but do show a difference in the direction of change.
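
The critical ratio (Dif/SE dif) can be illustrated with a short Python sketch. The article does not state how the standard error of the difference was obtained; the sketch assumes the usual formula for independent groups, the square root of sigma1²/N1 + sigma2²/N2. The function name `critical_ratio` is hypothetical. The figures used are the Degree-of-Change means, sigmas and numbers of cases that Table 1 appears to give for the Constitutional Psychopathic State group (61, 27, N = 61) and the Psychoneurotic group (72, 22, N = 47).

```python
import math


def critical_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Difference between two group means divided by its standard error.

    Assumes independent groups and SE_diff = sqrt(sd1^2/n1 + sd2^2/n2).
    """
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return abs(mean1 - mean2) / se_diff


# Degree-of-Change: CPS group versus Psychoneurotic group.
cr = critical_ratio(61, 27, 61, 72, 22, 47)
print(round(cr, 2))   # 2.33
print(cr >= 2.58)     # False: below the criterion adopted in the text
```

Under this assumption the result agrees with the 2.33 legible in the Degree-of-Change panel of Table 2, which falls short of the 2.58 criterion.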

It is interesting to note that disciplinary cases feel that they
have changed favorably since entering military service, and
that the epileptics fall midway between the Disciplinary and
the Control groups. The CPS group, all of whom were
diagnosed as Emotional Instability or Inadequate Personality, is
in reality not a psychopathic but a personality-disorder group.
True psychopaths have not yet been tested in sufficient
numbers for results to be reported, but preliminary data indicate
their C and Dr scores will be farther up in the positive range
than the disciplinary cases and their Dg score will be extremely
low.

The Fatigue group, which includes both Combat and
Operational Fatigue, shows a much greater Degree-of-Change than
the Psychoneurotic group although the Direction-of-Change is
not nearly so unfavorable. With this notable exception there
is a tendency for unfavorable change and degree of change to
parallel.

Non-Military Applications 


Although no extensive investigation has been made yet of
non-military use of the scale, it has been tried out with a few
veterans, using the instructions, "How have you changed since
you got out of service?," and with civilians using the
instructions, "How have you changed during the war?" and "How have
you changed during the past two years?" Responses appear to
parallel in range and variety those obtained from men in the
service, and suggest that scales of this type have a value in
appraising individual reactions to any major event or
period in a person's life.

Comment


It is not too surprising that a scale measuring psychological
changes should differentiate reaction-patterns as clearly as this
scale appears to do. In the clinical study of the individual an
understanding of the psychological state he is in at the moment
is no more important than a knowledge of the direction in which
he is moving. For the understanding of his past and the
prediction of his future behavior no information is more vital than
that which concerns the way he has been changing. Any
measurement of these changes, no matter how limited, is such a
clinical aid that it is quickly appreciated and utilized by those
doing clinical work. At least this has been the reaction of
Naval psychiatrists to whom patients' scores have been made
available.

The scale is by no means an ideal instrument. As stated
earlier, it is an initial attempt in the measurement of change,
confined to a specific group and using for its purpose only one
of many possible approaches. Its limitations are apparent
when one considers the extensiveness of the field in which the
attempt is made. Its usefulness in spite of its preliminary
nature is encouraging, and is evidence that continued research
in the measurement of psychological change will be rewarding.


Summary 


1. Presented in this article is a scale for measuring
psychological changes in the individual during military service.

2. The scale is an experiment in the direct measurement of
psychological change, in contrast to the measurement of
psychological characteristics.

3. Three scores are computed: Degree-of-Change, Direction-
of-Change and Combined score which takes into account
both the degree and the direction of the change.

4. Preliminary results on 349 naval hospital patients
illustrate the way in which the scale differentiates among various
clinical groups.

5. Outstanding among these results is the high degree of
change shown by the Fatigue group, and the favorable direction
of change characteristic of disciplinary cases.

6. The possibility of non-military use of the scale is
suggested.


For another approach to this problem, see: Hildreth, H. M.,
"A Battery of Feeling-and-Attitude Scales for Clinical Use."
Journal of Clinical Psychology (in press).


THE PERSONALITY OF ARTISTS 


ANNE ROE
Yale University


In the course of a study on the effects of the use of alcohol
on the creative process (2), personality studies were made of
twenty leading American painters. The sample was limited
to men, 38 to 68 years of age, resident in or near New York,
and native-born, or residents of this country since their early
teens. The group was so selected as to include most of the major
styles of painting: traditional, romantic, realist, abstract,
modern, surrealist, and social painters. It was also so
selected as to include men who could be classed from very
moderate to very heavy drinkers. This may have somewhat
weighted the sample with respect to the incidence of severe
maladjustment, but in general I believe it to be representative in
this respect of the successful members of this vocational group.

The personality studies are based on material gathered in
interviews, on study of the work of the man, and on the results
of two projective tests, the Rorschach and the Thematic
Apperception Test. The technical aspects of these test results are
discussed in some detail elsewhere (3). Here it is proposed to
discuss the results generally and the implications for testing
practice and interpretation. For greater simplicity the two
tests will be discussed separately.

The Rorschach method was easily administered to all but
one of these men; a few of them were compliant
rather than interested, to most of them it was an amusing task.
The one exception was very disturbed at the time.
Outstanding in the results is the fact that there is no personality
pattern common to the group which is, in fact, extremely
heterogeneous both with respect to the total picture and with
respect to the use of individual determinants.




The general adjustment level of the men, as reflected by the
total score on the Munroe Inspection Technique (1) was very
varied, total scores ranging from 3 to 18, with a mean of 10.3.
This measure is an extremely satisfactory method for group
analysis, and corresponds very closely with the clinical estimate
of maladjustment, higher total scores indicating more severe
degrees of maladjustment. This measure is not available for
other adult male groups so far as is known, with the exception
of a group of vertebrate paleontologists who were given the
Group Rorschach and whose Inspection Technique scores
ranged from 1 to 15 with a mean of 7.7. In college students
Munroe estimates that scores of 10 or over are likely to indicate
sufficient maladjustment for difficulties to appear in the college
situation.
Detailed results are most easily summed up in terms of
choice of location (whole, detail or space responses); content of
responses; and determinants (human movement, form, color,
shading, etc.) used. In the use of locations, the most consistent
finding in the group was the common tendency to increased
numbers of whole responses. Seventeen of the group gave more
than the 20-30 per cent considered average and only one gave
less than this. In addition, 5 of these subjects had more than
[figure illegible] per cent of unusual details, and 7 had unusually
large numbers of space responses. There were 5 whose
succession was loose or confused, that is whose use of different
location areas was erratic and without system.
One striking situation appeared in the content of the
responses. This was the number of anatomy and sex responses
which, even taking into consideration the general sophistication
of the group in these respects, was extremely high. There were
only five in the group who did not show a noticeable excess
in this type of response.

A few group tendencies can be seen in the use of
determinants, but no tendency was shown by all members of the
group. The per cent of form responses tended to be neither
especially high nor especially low. It was surprising
that 7 of these artists were noted to have made excessive use
of poor or vague forms.




Shading shock, usually mild, was noted for 12 in this group.
Six of the group had more than 20 per cent of Fc or form-
shading reactions, which is abnormally high.

There were 2 men who gave only one human movement
response and 2 who gave none. In addition there were 3 whose
human movement responses were restricted, either in terms of
preference for parts of the body rather than the whole, or in
terms of marked passivity of the movement seen. Both animal
motion and inanimate motion were sometimes excessive.

Color shock was present in all but two of these subjects
(according to Munroe's criteria which include milder degrees than
most). Interestingly enough, the two who did not show it were
the two with the lowest and highest Inspection Technique
scores; in the latter its absence is a rather serious indication.
Eight of these men gave none or only one form-color response,
and 5 gave excessive numbers of color-form responses.
Again it should be emphasized that there is great variation
in the group, but a few general comments may be made. As a
whole, quantitative analysis shows these men to be
characterized by above-average intelligence, unusually great use of whole
responses, marked prevalence of color and shading shock, and
overproduction of responses of sexual content.

In addition, but less generally, there is some overproduction
of space responses, some use of loose succession, frequent use
of vague or poor forms, diminution in the use of human
movement responses with a tendency to excessive movement in
general, and underproduction of form-color responses with
above-average production of form-shading responses.
Prolonged search, however, failed to disclose any "signs" whose
presence indicates capacity to function successfully as an artist.

[...] this is a very striking finding [...] than my own.



For this reason the protocols were submitted for blind
analysis to Dr. Bruno Klopfer. His only information about the
subjects, other than age and sex, was that they were
professionally successful. For obvious reasons, he was not asked
specifically about their creativity. His comments amply
confirmed my own opinion. It was impossible to delete from the
protocols of two of the men remarks that made it apparent
that they had some connection with art. Klopfer noted these
and added that one was probably a successful creative artist,
but that it was likely that with the other it was an avocation
since it was so improbable that he could be successful at it
professionally. Of the others, he remarked in 5 instances that
creative ability was evident (limited in one of the five and
probably not usable by another because of neurotic conflicts).
In 5 others he remarked its absence, and by inference he
remarked its absence in 3 more. Of the remaining 5 he made no
comments which would indicate an opinion one way or another
but it is obvious that he was not struck with the presence of
such ability.


[...] did show "creative" ability. Creative ability, then, may exist
without being shown in the Rorschach (or we may recognize
some indications of it but not others). The alternative
possibility is that one may be a successful artist in our society
without having creative ability. The two hypotheses are not
necessarily mutually exclusive, and certainly there are not
sufficient data at hand to suggest that one is more likely to be
true than the other. It seems extremely important, however, to
recognize that, whatever the explanation, we are in no position
to say to any subject on the basis of performance on the
Rorschach that he is incapable of becoming a successful painter.
In view of these results it would seem highly desirable to
re-examine our theories of creativity and to examine the
precise function of the artist in our society.


The possibility that part of the difficulty is the logical fallacy
of the [illegible] middle term must be considered; it may be a
factor, but it seems in any case to be a minor one here.





Qualitative analysis also revealed the presence of consider- 
able similarity in members of this group with regard to the 
nature of their sex development. This was confirmed by the 
Thematic Apperception Test findings and will be discussed 
following discussion of other results on that test. 

The Thematic Apperception Test was extremely difficult
to administer to this group of men (it was given to 18 of the
20) because of the fact that they were, without exception, so
appalled by the poor quality of the pictures, artistically
speaking, that they had repeatedly to be recalled from critical
comments to the task at hand. This reaction was sufficiently
strong that interpretation at some points is difficult. For
example, there is generally great curtailment of time reference,
attention being largely limited to the immediate moment, with
disregard of the past and of the future. It may be legitimate
to interpret this at face value, but it must be considered that
it was possibly influenced by their critical attitude and by a
wish to be through with the thing as quickly as possible. In
general, too, they characteristically ignored details, but again
one cannot be sure of the interpretation. It is likely that this
objection would not be met with in other groups, at least to
the same extent, but it is unfortunate that it should enter at all.
Most probably, however, the protocols can be largely accepted
at face value, with only moderate limitations. In
general, the information which can be derived from them nicely
supplements the Rorschach material and supplies leads to the
development of the personality structure seen in cross section
in the Rorschach.

It is difficult to discuss group performance on this test, but
some group analysis has been made. The great curtailment
of time reference has already been mentioned. There are not
many unusual stories in the group, although most of the men
put in an additional unusual twist here and there. Some were
"non-acceptable" stories, in Rapaport's meaning, [...]
stories of homicide, suicide, etc. This is not many [...]
stories (only 10 cards were used for each man). There was
only one man who told stories unrelated to the picture. [...]
there was little out of the ordinary in the content of the
stories.




A list of formal characteristics drawn from various sources
was made, and the stories were analyzed from this point of
view. Eleven of the subjects tended to "overspecification" of
events reported in the stories, and 9 occasionally
overgeneralized. Nine introduced personal judgments, i.e., expressions
of approval or disapproval of the indicated action. Seven
subjects referred to events in their own lives which the cards
reminded them of; this was most often in response to Card 1, and
was probably a way of feeling out the situation. Seven
introduced non-existent figures into the stories. Rapaport considers
this a serious indication and it may be, in general; in this group
it most often occurred on Card 5 where I suspect it is of less
import. Seven of them referred to Card 4 as a movie, ballet, etc.
This may reflect a tendency to wish to shy away from strong
emotional situations of a particular type, which would certainly
be in accord with the picture of the group as a whole. On the
other hand this is perhaps the "cheapest" card in the group and
this interpretation may largely be a reflection of this judgment.

There were a few very common perceptual disorders. The
gun in Card 3 was frequently misrecognized or omitted from
the story. This accords with the generally non-aggressive
character of the group which will be discussed below. On this
card, also, the figure was most often taken to be that of a
woman. In fact only two of the men took it as that of a man
without any hesitation and both of these had difficulty
determining the sex of one figure in Card 10, which also caused
difficulty to others. The implications of this are in close accord
with implications about sexual development which appeared
in the Rorschach analyses.

Almost all of these men, whatever their general personality
structure, seem to have a type of social and sexual adaptation
which is of a markedly non-aggressive sort, and hence is
more "feminine" than "masculine" according to our cultural
stereotypes. It is important to remember that this type of
development has not precluded either vocational success or
success in social relations, even though many of them do
have some difficulties with the latter. At the same time many
of them in spite of their unaggressiveness have persevered in
their vocation in the face of severe economic and social hazards.



There is no overt homosexuality in this group and the latent
homosexual trends are not generally excessively strong. All
but one of them are married (a number of them have married
several times) and nine of them have children. It is perhaps
pertinent that most of them are married to professional
women, artists, singers, dancers, who probably have an analogous
sexual development.

One problem is whether this type of adjustment is uniquely
characteristic of this particular vocational group. It seems
clear that this is not so; it has often been remarked, e.g., that
such an attitude is characteristic of physicians, and it is my
impression that it is also characteristic of scientists, and, in
fact, generally of the sensitive, intelligent man who follows
more or less intellectual pursuits. How important a factor
this may have been in determining the choice of a vocation is
not known. It is very possible that intellectual pursuits have
become a refuge for men who do not follow the culture pattern
in this respect and whose deviation from it is of this sort.

In many respects this pattern seems a richer and socially
more desirable one than the "frontier" type which well
represents the pattern which seems to be culturally accepted and
which clearly lacks a number of social and spiritual values
found in the other. Nevertheless, when it is considered that to
a large extent our social ideals are developed by the men who
follow intellectual pursuits, if only because they are in a
better position to express their thinking adequately, and that to
a considerable extent our politically active men seem to be
drawn from the aggressively masculine type, it is obvious that
serious difficulties are inevitable. Further, a man whose own
personality does not contain some freely usable aggressive
elements is not equipped to deal, even across a council table, with
men whose major adaptation is basically an aggressive one.

It would be well worth while to study our cultural
stereotypes of male and female emotional development and the
actual distribution of these types in our society. It is not
certain whether we have in fact one or several abstract
stereotypes. To maintain as an abstract cultural ideal a single
type from which a high percentage of persons deviate is to
insure a high incidence of neuroticism. To maintain several ill-
defined and overlapping ideals, accepting one type in some
groups or situations and different ones in others, necessarily
introduces misunderstandings of a profoundly serious nature.
Studies of personality as related to vocation and social status
are urgently needed as an aid to an understanding and eventual
solution of many social problems.


REFERENCES

1. Munroe, R. "The Inspection Technique: a Method of Rapid
   Evaluation of the Rorschach Protocol." Rorschach Research
   Exchange, VIII (1944), 46-69.
2. Roe, Anne. "Alcohol and Creative Work." Quarterly Journal
   of Studies on Alcohol, VI (1946), 415-467.
3. Roe, Anne. "Painting and Personality." Rorschach Research
   Exchange. In press.


MEASUREMENT ABSTRACTS*

* Edited by Forrest A. Kingsbury.


Benton, A. L. and Probst, K. A. "A Comparison of Psychiatric
Ratings with Minnesota Multiphasic Personality Inventory
Scores." Journal of Abnormal and Social Psychology, XLI
(1946), [pages illegible].

Psychiatrists rated 76 patients on personality trends
as described in the Minnesota Manual. Subsequently the patients
were given the Minnesota Multiphasic Personality Inventory. The
results show significant agreement between the psychiatric ratings
and the test score with respect to Psychopathic Deviate, Paranoia,
and Schizophrenia. No significant agreement was found with respect
to [...] Hysteria, Femininity, and Psychasthenia.


Blair, G. M. and Clark, R. W. "Personality Adjustments of Ninth-
Grade Pupils as Measured by the Multiple Choice Rorschach
Test and the California Test of Personality." Journal of
Educational Psychology, XXXVII (1946), 13-20.

The Harrower-Erickson Multiple Choice Rorschach Test and the
California Test of Personality were administered to 382 ninth-grade
pupils and correlations and analyses were made of the results.
Pearson Product Moment correlations between the number of "poor
answers" on the Rorschach Test and the number of "undesirable
answers" on the California Test are low but statistically significant.
Individuals designated as the Maladjusted Rorschach Group made
on the average a higher number of "undesirable" responses to the
California Test than did the total group tested. Biserial correlations
representing the relationship between maladjustment as measured
by the Rorschach and as measured by each of the 12 components of
the California Test are in 6 cases statistically significant, and use of
the same statistical procedure shows the scores on Total Adjustment,
Self-Adjustment and Social Adjustment to be significantly
related to maladjustment as measured by the Rorschach. It is
concluded, however, that none of the relationships between the scores
obtained from the two tests is high enough to indicate that the tests
measure to more than a very slight extent the same aspects of
personality. Frances Smith.


Bradford, E. J. G. "Selection for Technical Education." Part I.
British Journal of Educational Psychology, XVI (1946),
20-31.

A study of technical [...] and tests aimed at gaining information
which might lead to the more accurate selection of pupils for a
technical education. The author focuses attention on the need to
consider the nature of the abilities which are likely to respond best to a
technical curriculum, or the type of curriculum best suited to those
with a particular type of test ability. The analysis of the abilities
was done by means of Burt's Summation Method. Irene
Robinson.


Carlson, Jessie J. "Psychosomatic Study of 50 Stuttering Children;
III: Analysis of Responses on the Revised Stanford-Binet."
American Journal of Orthopsychiatry, XVI (1946), 120-126.

One of a round-table series treating stuttering as a psychoneurotic
manifestation having important psychosomatic aspects, this tells of
matching for age, sex, and IQ 50 stutterers from speech classes of
New York City's Board of Education with 50 children from [...]
of the Board's Bureau of Child Guidance. Both groups had been
given Form L of the Stanford-Binet. While the matched group
[text illegible] inferior to the others in general verbal ability, their
percentage of success on most items being slightly higher; in handling
non-verbal material they were definitely inferior to a degree
approaching statistical reliability. Vernon S. Tracht.

Darcy, Natalie T. "The Effect of Bilingualism upon the
Measurement of the Intelligence of Children of Pre-school Age." Journal
of Educational Psychology, XXXVII (1946), 21-41.

Two hundred and twelve children from nursery schools in
Manhattan and Brooklyn and P. S. 97 in Manhattan were divided into
bilingual (Italian-English) and monolingual groups of 106
each, matched as to age, number, sex, and socio-economic status (as
determined by parental occupations). Both were given the 1937
Stanford-Binet and the Atkins Object-Fitting Tests. The
performance of the monolingual group surpassed that of the bilingual on the
Stanford-Binet by statistically significant scores. The performance
of the bilingual group was significantly superior to that of the
monolingual on the Atkins Test. It is concluded that bilingual subjects
suffer a language handicap and that although the Atkins scale cannot be
substituted for the Binet, both measure the same functions to some
extent. Esther Litwak.

Edwards, A. L. and Kenney, K. C. "A Comparison of the Thurstone
and Likert Techniques of Attitude Scale Construction." Journal
of Applied Psychology, XXX (1946), 72-83.

The Thurstone method of Equal-Appearing Intervals and the
Likert method of Summated Ratings were studied comparatively as
techniques of scale-construction, employing as a basis for comparison
the original statements of opinion used by Thurstone and Chave in
the construction of their scale designed to measure attitude toward
the church. Separate scales were independently constructed from
these by 72 members of an introductory psychology class, and
members of two other psychology classes were then presented with
the scales in counterbalanced order, for the purpose of obtaining data
on their reliability and comparability. Results indicate that it is
possible to construct scales by the two methods which will yield
comparable scores, that scales constructed by the Likert method will yield
higher reliability coefficients with fewer items, and that the Likert
technique is less laborious. Frances Smith.


Fleming, E. G. and Fleming, C. W. "A Qualitative Approach to the
Problem of Improving Selection of Salesmen by Psychological
Tests." Journal of Psychology, XXI (1946), 127-150.

Six paper-pencil tests, Bernreuter Personality Inventory, Moss
Social Intelligence Test, Washburne S-A Inventory, Otis
Self-Administering Higher Examination of Mental Ability, Canfield Test of
Sales Knowledge, and Strong Vocational Interest Blank, were
administered to 583 men representing 12 companies. From this battery
sub-test scores were available and the pattern of the individual's
performance was studied in relation to the specific job requirements.
The predicted efficiency was then compared to the sales executives'
estimates of actual accomplishment. On these data a tetrachoric
correlation of .49 and a chi-square of 8.15 were found. Francis F.
Tedland.

Sarrety, Henry E. “The Effects of Schoolis Upon 1.0.” Psycho- 
gical Buletin, XLIII (1946), 72-76. 2 » 
pene article by Irving ке ХЕТ Makes а Difference, 

"А hers College Record, XLVI, 483-492, is examined with reference 

exten: УО conclusions which it implies: (1) the more recent an 
*nsive a person's education, the better he is likely to perform on 


te H : А 
А Haas words and numbers, and Сор ne IQ 

~ author o ticle concedes the first conclusio 
P the prec d in comparing data for 


ер еп 
the Mate, though statistical procedures use pete att 
t 


an iginal in his opinion 

study make the results, in his opinion, т 

tng conclusive. The second conclusion bye а 

in e vi ioni of the terms M.A. 

1 y the evidence, questioning Use i r 

subj Paring group ce individual test scores, and Ri to 
Jects who аге beyond the age of 16 years. Frances Smith. 

Gotham, R. E. “Personality and Teaching Efficiency.” Journal of
Experimental Education, XIV (1945), 157-165.
[…] was to determine: (1) The relationship between a teacher's
personality and her ability to produce measurable changes in her
pupils; (2) The interrelationships among […] measures of
personality; (3) The predictability of […] from a composite of
personality measures. Four […] teaching success were used: (a) Five
teacher rating scales; (b) Thirteen tests as measures of qualities
associated with teaching success; (c) Three batteries of tests as
criterion of pupil changes; (d) A composite of the foregoing. Results
showed no significant difference between the personality inventory
scores and the criterion of pupil change. Some relationship was
found between the criterion of pupil change and the teacher rating
scales. A multiple correlation of […] was found between a composite
of teacher personality measures and pupil change. Also significant
was the lack of agreement found among the several criteria of
teaching efficiency. Betty Steele.
Guttman, Louis. “A Basis for Analyzing Test-Retest Reliability.”
Psychometrika, X (1945), 255-282.
Three sources of variation in experimental results for a test are
distinguished: trials, persons, and items. Unreliability is defined
only in terms of variation over trials. This definition leads to a
more complete analysis than does the conventional one; Spearman's
contention is verified that the conventional approach, which was
formulated by Yule, introduces unnecessary hypotheses. It is
emphasized that at least two trials are necessary to estimate the
reliability coefficient. This paper is devoted largely to developing
lower bounds to the reliability coefficient that can be computed from
but a single trial; these avoid the experimental difficulties of making
independent trials. Six different lower bounds are established,
appropriate for different situations. Some of the bounds are easier
to compute than are conventional formulas, and all the bounds assume
less than do conventional formulas. The terminology used is
that of psychological and sociological testing, but the discussion
actually provides a general analysis of the reliability of the sum of n
variables. (Courtesy of Psychometrika.)
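
As an illustration of the kind of single-trial lower bound the abstract refers to, the sketch below computes two bounds commonly attributed to this paper: lambda-1, taken as 1 minus the ratio of the summed item variances to the total-score variance, and lambda-3, taken as n/(n - 1) times lambda-1 for n items. The abstract itself gives no formulas, so these forms, the function name `guttman_lower_bounds`, and the toy data are assumptions made for illustration, not a reproduction of the paper's derivation.

```python
from statistics import pvariance


def guttman_lower_bounds(scores):
    """Return (lambda1, lambda3) for one trial of a test.

    scores: list of rows, one per person; each row lists that
    person's item scores. This layout is assumed for illustration.
    """
    n_items = len(scores[0])
    # Variance of each item across persons.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(n_items)]
    # Variance of the total (summed) score across persons.
    total_var = pvariance([sum(row) for row in scores])
    lambda1 = 1.0 - sum(item_vars) / total_var
    lambda3 = n_items / (n_items - 1) * lambda1
    return lambda1, lambda3


# Toy data (hypothetical): three persons, two items that agree perfectly.
toy_scores = [[0, 0], [1, 1], [2, 2]]
lambda1, lambda3 = guttman_lower_bounds(toy_scores)
print(lambda1, lambda3)
```

With two perfectly agreeing items the first bound comes out at about one half and the second at about one, which shows how much the simpler bound can understate reliability for a short test.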


Harris, Robert E. and Christiansen, Carole. “Prediction of Response
to Brief Psychotherapy.” Journal of Psychology, XXI (1946),
269-284.
The purposes of the study were: (1) To discover predictability
of response to psychotherapy; (2) to discover the personality
characteristics associated with different responses. Twenty […] males
and 24 females were drawn from a population of patients recovering
from physical disease or accident. A brief psychotherapy […]
psychoanalytic methods was used. At the end of the therapy
period each patient was rated by the order of merit method by […]
judges as to suitability for therapy. The techniques used to predict
response to brief psychotherapy were the Minnesota Multiphasic
Personality Inventory, the Rorschach, and the Wechsler-Bellevue. The
test findings were compared with the clinical ratings after
therapy. Both techniques showed differences between patients
responding well and poorly to therapy. A hypothesis is advanced
that ego strength or a factor of stability-modifiability of the
personality are important characteristics in response to therapy.
Betty Steele.




Havighurst, R. J., Gunther, M. K., and Pratt, I. E. “Environment
and the Draw-a-Man Test: The Performance of Indian Children.”
Journal of Abnormal and Social Psychology, XLI (1946), 50-63.
The Goodenough Draw-a-Man Test was given to 325 Indian children
aged […] through eleven in the Hopi, Zuni, Zia, Papago, Navaho, and
Sioux tribes. Representative samplings were obtained […] of the
nine communities studied. Results show Indian children to be
superior to white children. Average IQ's ranged from […] (Hopi,
First Mesa) to 102 (Sioux, Pine Ridge). […] than girls in the Hopi,
Zuni, Zia, and Sioux […]. Correlations between the Arthur
Performance Test given to the same children and the Draw-a-Man Test
were low. Evidence points to the conclusion that environment
affects performance on the Draw-a-Man Test. Betty Steele.


Hellfritzsch, A. G. “A Factor Analysis of Teacher Abilities.”
Journal of Experimental Education, XIV (1945), 166-199.
[…] teachers' abilities. The problems […] are: (1) The number of
common factors in a […] measures used in determining the nature of
teaching ability; (2) The kinds of factors; (3) The factors measured
by various […]; (4) The factors related to pupil growth; (5) The
factors related to supervisory ratings of teachers. The method of
factor analysis […] method described by Thurstone. […] factor,
GKMA: General Knowledge and Mental Ability Factor; (2) A supervisory
rating factor, TRS: Teacher Rating Scale Factor; (3) A personality
[…] and Adjustment Factor; (4) An […] Attitude toward the Teaching
[…] to evaluate the effectiveness […] pupil growth. Results reveal
[…] be substituted for the actual […] evaluating the ability of the
teacher.


Herr, Selma E. “The Effect of Pre-First Grade Training upon
Reading Readiness and Reading Achievement among Spanish-American
Children.” Journal of Educational Psychology, XXXVII (1946),
87-[…].
[…] hundred Spanish-speaking children in nine towns in New
Mexico were equated as to age and IQ on the Pintner-Cunningham
Primary Intelligence Test, Form B. One hundred were given one year of
pre-school training with emphasis on social and emotional adjustment,
vocabulary development, physical development of auditory and
visual perception, habits of memory, cooperativeness, and social
attitudes. The […] improvement of the experimental group in IQ on
the Pintner-Cunningham Primary Intelligence Test, Form A, was
17.46 ± .64 greater than the control group; on the Metropolitan
Reading Readiness Test, 40.67 ± .55. After a year in the first
grade, all children in the experimental group were promoted to the
second grade while 80% of the control group were at or below grade
placement of 1-3. It was concluded that pre-first grade training is
an important factor in success in learning to read among
Spanish-American children. Esther Litwak.


Hilkevitch, Rhea R. “A Study of the Intelligence of
Institutionalized Epileptics of the Idiopathic Type.” American Journal
of Orthopsychiatry, XVI (1946), 262-270.
Sixty-six epileptic patients in the Dixon, Illinois, State Hospital
were given Stanford-Binet examinations as part of a broader study
dealing with the social and psychological factors attributed to
institutionalized epileptics. Thirty-four males and 32 females,
ranging in age from 8 to 53 years, comprised the group. The findings
in this study seem to verify those of other investigators in
general agreement on the independent character of deterioration in
relation to the onset and duration of seizures, even though it is still
open to question whether deterioration is dependent upon the nature
of the seizures. The author's implications are that where
deterioration occurs, it begins early, is probably apparent at the
onset, and is related to the frequency of seizures; that in a
considerable number of cases feeblemindedness is a likely concomitant
with epilepsy rather than induced by it; and that these two
conditions are factors in institutionalization. Vernon S. Tracht.

Howard, Ruth W. “Intellectual and Personality Traits of a Group
of Triplets.” Journal of Psychology, XXI (1946), […].
A study was undertaken to determine the comparative development
of single-born and multiple-born individuals. […] pre-school
and 51 school-age triplets, tests of general development, language
development, non-language development, and personality were
administered. Because the subjects ranged in age from 2 years to 15
years, different batteries of tests were given. […] of general
ability, language and non-language abilities both pre-school and
school-age triplets were inferior to average […] children of their
age. The school-age triplets were, in general, closer to the
average than were pre-school triplets. […], however, from rural
districts and lower socio-economic levels. On personality appraisal
this group of triplets were considered […] single-born children.
Francis F. Medland.


Hunt, W. A. and Stevenson, I. “Psychological Testing in Military
Clinical Psychology: I. Intelligence Testing.” Psychological
Review, LIII (1946), 25-35.
In this first of two articles on psychology's role in the military
service, the authors give a broad, comprehensive survey of the
intelligence-testing field. They discuss both the assets and liabilities
of abbreviated test forms and techniques, the development of which
they consider the outstanding contribution of war-time psychology.
They feel that the unique opportunity thus presented by this social
emergency for testing large numbers of the population, a truly
random sampling, will inevitably lead, as it did with them, to a critical
re-examination of many academically conceived concepts and a
sharpening of the psychologists' testing “tools.” Vernon S. Tracht.


Hunt, W. A. and Stevenson, I. “Psychological Testing in Military
Clinical Psychology: II. Personality Testing.” Psychological
Review, LIII (1946), 107-115.
As in the case of their earlier report on intelligence testing in the
military services, the authors state that the vast numbers to be
tested, plus the shortage of trained personnel, resulted in two
characteristics differentiating military from civilian clinical practice
in the application of personality inventories. These are the emphasis
upon speed and upon classification and disposition of the cases, rather
than on any extensive recourse to therapy. Adaptation and refinement
of older tests and techniques, not the invention of new ones,
characterized personality testing in World War II, the development
of screen tests for use in neuropsychiatric selection being its
most prominent contribution to postwar clinical psychology. Vernon
S. Tracht.


Jayne, C. D. “A Study of the Relationship Between Teaching
Procedures and Educational Outcomes.” Journal of Experimental
Education, XIV (1945), 101-134.
This is an […] investigation of the relationship between
specific observable teacher acts and changes produced in pupils as
[…] by means of the analysis of […]. […] investigations, the […],
the objective of the second being the learning of specific […]. No
significant correlations were found between […] and the educational
outcome. The author […] different procedures were more effective for
the different objectives […] the two investigations. Irene P.
Robinson.


[…] “[…] Study of 50 Stuttering Children […].” American Journal of
Orthopsychiatry, XVI (1946), 127-133.
The author in this Rorschach part of the study of stuttering
followed the same procedure as […] describes in her report on the
Stanford-Binet findings, namely, […] for age, sex, and
intelligence 50 stutterers against 50 non-stuttering problem children.
The purpose was to note major differences between the two groups.
Although both were […] unstable and neurotic, the stutterers were […]
than the problem […], who were known already to […] personality
difficulties because of previous referral to the Bureau of Child
Guidance. Data from the Rorschach strongly indicate that stuttering is
often manifested by obsessive-compulsive traits or neurosis and is
closely connected with emotional and personality maladjustment.
Vernon S. Tracht.


Lawshe, C. H. and Mills, W. B. “Further Studies in the Development
of Test Batteries for Identifying Potentially Successful
Naval Electrical Trainees.” Journal of Psychology, XXI (1946),
[…].
This research was undertaken to determine whether a Navy Test
Battery administered at the induction station identified sufficiently
well the individuals most apt to be successful in a Navy Training
School for electricians. Using the average of eighteen percentage
grades of the individual as proficiency criterion the predictive value
of Battery No. 1 (six Navy tests plus three local tests) and Battery
No. 2 (six Navy tests only) were determined. N = 100 cases. By
means of the Wherry-Doolittle technique, the maximum shrunken
multiple correlation with the criterion was determined. It was
found that Battery No. 1 predicted 57% of the subjects' grades
within three points, whereas Battery No. 2 predicted 49% of the
subjects' grades within three points. Because of Naval regulations
the names of tests used are withheld. Francis F. Medland.


Lummis, Clifford. “The Relation of School Attendance to Employment
Records, Army Conduct and Performance in Tests.”
British Journal of Educational Psychology, XVI (1946), 13-19.
Records of the type of school attendance of 1,000 soldiers passing
through an Army Selection Center were tabulated and correlated
with the records of their Army conduct, their civilian employment
record and the results of selection tests of general intelligence,
mechanical principles, arithmetic, and verbal knowledge.
Correlations found range from .3 to almost .7. The author draws
inferences of significance for educationists. Irene P. Robinson.


Postman, L. and Bruner, J. S. “The Reliability of Constant Errors
in Psychophysical Measurement.” Journal of Psychology, XXI
(1946), 293-299.
The temporal and spatial order in which the standard and variable
stimulus are presented systematically affects the distribution of
judgments. This is measured and defined as the constant error. Any
measure of the significance of the constant error reduces to a
statistical test of the null hypothesis that a set of measures taken
under one spatio-temporal condition differs only by chance from a set
of measures under another spatio-temporal condition. For the Method
of Average Error where three parameters are involved (time, space,
and handedness) analysis of variance is recommended. In the
Method of Constant Stimulus Differences two parameters are involved
(time and space). Here the hypothesis is that the obtained
distribution of […] or “less” is only a chance deviation from a
symmetrical distribution. For this case Chi-Square is recommended
since it determines at what level of confidence the hypothesis may be
rejected. Francis F. Medland.
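
The chi-square check described for the two-parameter case can be sketched as follows. This is a minimal illustration, assuming the judgments have already been tallied into two counts and that a symmetrical distribution means equal expected counts; the function name `chi_square_symmetry` and the counts are hypothetical, and the abstract does not say exactly how the authors set up their test.

```python
import math


def chi_square_symmetry(n_first, n_second):
    """Chi-square test (1 degree of freedom) that two judgment
    counts depart only by chance from an even split."""
    expected = (n_first + n_second) / 2.0
    chi2 = ((n_first - expected) ** 2 + (n_second - expected) ** 2) / expected
    # For 1 degree of freedom the upper-tail probability of a
    # chi-square value x equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical tallies: 60 judgments in one category, 40 in the other.
chi2, p_value = chi_square_symmetry(60, 40)
print(chi2, p_value)
```

With these made-up tallies the statistic is 4.0, so the hypothesis of a symmetrical distribution would be rejected at roughly the 5 per cent level.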


Sarason, Seymour B. and Sarason, Esther Kroop. “The Discriminatory
Value of a Test Pattern in the High Grade Familial Defective.”
Journal of Clinical Psychology, II (1946), 38-49.
[…] children from families with more than […] child […]
relatively the same degree of mental […] determine whether tests
differentiate between good and […] adjustment. Each child was given
the […] Binet (L), the Arthur Performance […] and an
electroencephalographic examination. Cases were […]: those with
Kohs Blocks scores above Binet M.A.'s by […] months and those with
Kohs scores below Binet by 18 months. The Kohs-below-Binet group
failed tests characteristic of brain pathology. Qualitatively their
performance was disorganized, impulsive, lacked persistence. Although
both groups […] emotional disturbance on the Rorschach, the
Kohs-above-Binet group seemed more stable. This group had “good”
[…] records. Of the Kohs-below-Binet group, 60% had some form of
abnormal record on EEG, while only 18% of Kohs-above-Binet had such
records. Esther Litwak.


Thurstone, L. L. “Factor Analysis and Body Types.” Psychometrika,
XI (1946), 15-30.
A factorial analysis was made of a small number of anthropometric
measurements. […] small battery has been […] by the author for
teaching purposes. Several of […] seem […] meaningful, but
their acceptance must depend on […] comprehensive studies of body
measurements, with a larger number of measurements. (Courtesy of
Psychometrika.)

Tucker, Ledyard R. “Maximum Validity of a Test with Equivalent
Items.” Psychometrika, XI (1946), 1-[…].
It is assumed that a scale of true score function exists and
that the probability of answering an item correctly is a curve of the
[…] of the integral of the normal curve. The product-moment
correlation between the test score and true score is […] for a normal
distribution of subjects and a test composed of equivalent items.
[…] examples demonstrate that […] correlation between the scores and
true scores […] one-hundred-item test […] point correlation between
items […] less than three-tenths. (Courtesy of Psychometrika.)




Tuckman, Jacob. “A Comparison of the Reliability and Performance
for the Minnesota Rate of Manipulation Test for Subjects
Tested Individually and in Groups of Two.” Journal of Applied
Psychology, XXX (1946), 37-41.
In a study conducted to determine the differences in reliability
and in test performance in the Minnesota Rate of Manipulation Test
between subjects tested individually and in groups of two, a
comparison was made of test scores for Placing and Turning for 463
high-school boys and girls tested individually and 385 high-school
boys and girls tested in groups of two. For both Placing and Turning,
reliability coefficients tended to be higher for boys and girls tested
individually, though differences were not statistically reliable when
the combined groups were compared. The performance of subjects
was found, however, to be significantly faster on both tests for boys
and girls tested in groups of two. A table is included presenting
separate norms for high-school students tested individually and in
groups of two. Frances Smith.


Von Eschen, C. R. “The Improvability of Teachers in Service.”
Journal of Experimental Education, XIV (1945), 135-156.
The effects on teacher success of a supervisory program for 57
seventh- and eighth-grade teachers in terms of measurable pupil
changes were studied experimentally in one- and two-room […]
schools. The supervisory program consisted of twelve visits to
each teacher during which emphasis was put on the development of
reading and basic study skills, teaching pupils to make and apply
generalizations, practical helps for improving instruction and pupil
achievement, etc. Group comparisons were made of the changes
between the initial test scores and the final scores in eight areas.
Changes in four teacher-qualities found most closely related to
teacher success were also measured. The supervisory program was
most effective in producing pupil growth in some of the
traditional educational objectives and in areas in which the program
was most concentrated. There was no significant change in teacher
quality, but the positive change in teacher-pupil relationship
approached statistical significance. Esther Litwak.


Wall, W. D. “The Educational Interests of a Group of Young
Industrial Workers.” The British Journal of Educational
Psychology […].


Watson, R. I. “The Use of the Wechsler-Bellevue Scales: A
Supplement.” Psychological Bulletin, XLIII (1946), 61-68.
A discussion of findings obtained from the use of the
Wechsler-Bellevue Scales, supplementing the article by A. I. Rabin,
“Use of the Wechsler-Bellevue Scales with Normal and Abnormal
Persons,” Psychological Bulletin, XLII, 410-422. Studies in the
literature additional to those mentioned by Rabin are cited.
Comparisons between the Wechsler-Bellevue Scale and other measures
indicate fairly high correlations between the Wechsler-Bellevue Scales
and verbal measures of intelligence, lower though substantial
correlations with performance-type scales, and a trend of relatively
higher Wechsler-Bellevue IQ's for duller subjects and relatively lower
ones for brighter subjects. Studies of scatter of Wechsler-Bellevue
scores and of the psychological functions tapped by the subtests
support the author's contention that while the Wechsler-Bellevue
Scales supplement other diagnostic devices they do not supplant them,
and that much work remains to be done before the meaning of subtest
scores can be established. Frances Smith.


Wiese, Mildred J. and Cole, Stewart G. “A Study of Children's
Attitudes and the Influence of a Commercial Motion Picture.”
Journal of Psychology, XXI (1946), 151-171.
The purposes of the study were: (a) To examine the information
and beliefs held by high-school youth regarding the differences
between the American and Nazi ways of life; (b) to discover changes
in information and beliefs after seeing the picture, Tomorrow
the World. A free response test was given to 1,500 students from
high schools in Pasadena, Willowbrook, Beverly Hills, and Salt Lake
City. Results show that high-school students are well informed
concerning the traditional tenets of American life and are less well
informed on those of Nazi life. The picture softened the students'
judgments of the severity of the Nazi regime. The students'
responses differed according to their economic and social background.
Betty Steele.


ADDITIONAL ARTICLES NOT ABSTRACTED


Altus, W. D. “The Comparative Validities of Two Tests of General
Aptitudes in an Army Special Training Center.” Journal of
Applied Psychology, XXX (1946), 42-44.
[…] B. and Potechin, E. “A Simplified Form for Reporting Test
Results.” Journal of Applied Psychology, XXX (1946), 32-36.
Berdie, Ralph F. “Range of Interests and Psychopathologies.”
Journal of Clinical Psychology, II (1946), 161-166.
Brogden, Hubert E. “On the Interpretation of the Correlation
Coefficient as a Measure of Predictive Efficiency.” Journal of
Educational Psychology, XXXVII (1946), 65-76.
Burt, Cyril. “The Assessment of Personality.” British Journal
of Educational Psychology, XV (1945), 107-126.


Carlson, Hilding B. “A Simple Orthogonal Multiple Factor
Approximation Procedure.” Psychometrika, X (1945), 283-301.
Combs, Arthur W. “A Method of Analysis for the Thematic
Apperception Test and Autobiography.” Journal of Clinical
Psychology, II (1946), 167-174.
Corsini, R. “Season of Birth and Mental Ability of Prison Inmates.”
Journal of Social Psychology, XXIII (1946), 65-72.
Cummings, S. B., MacPhee, H. M. and Wright, H. F. “A Rapid
Method of Estimating the IQ's of Subnormal White Adults.”
Journal of Psychology, XXI (1946), 81-89.
Detchen, Lily. “The Effect of a Measure of Interest Factors on the
Prediction of Performance in a College Social Sciences
Comprehension Examination.” Journal of Educational Psychology,
XXXVII (1946), 45-52.
Dimmick, F. L. “A Color Aptitude Test, 1940 Experimental
Edition.” Journal of Applied Psychology, XXX (1946), 10-22.
Drake, Lewis E. “A Social I. E. Scale for the Minnesota
Multiphasic Personality Inventory.” Journal of Applied Psychology,
XXX (1946), 51-54.
Fleege, U. H. and Malone, H. J. “Motivation in Occupational
Choice Among Junior-Senior High-School Students.” Journal
of Educational Psychology, XXXVII (1946), 77-86.
Forbes, J. K. “The Distribution of Intelligence Among Elementary
School Children in Northern Ireland.” British Journal of
Educational Psychology, XV (1946), 139-145.
Franck, Kate. “Preference for Sex Symbols and Their Personality
Correlates.” Genetic Psychology Monographs, XXXIII (1946), […].
Gaskill, Harold V. and Fritz, Martin F. “Basal Metabolism […]
Psychological Test.” Journal of General Psychology, XXXIV
(1946), 29-45.
Gough, H. G. “Diagnostic Patterns on the Minnesota Multiphasic
Personality Inventory.” Journal of Clinical Psychology, II
(1946), 23-37.
Gruen, Emily W. “Level of Aspiration in Relation to Personality
Factors in Adolescents.” Child Development, XVI (1945),
181-188.
Hsu, E. H. “A Factorial Analysis of Olfaction.” Psychometrika,
XI (1946), 31-42.
Jackson, Joseph. “The Relative Effectiveness of Paper-Pencil Test,
Interview, and Ratings as Techniques for Personality
Evaluation.” Journal of Social Psychology, XXIII (1946), 35-54.
[…]. “Serial Correlation.” Psychometrika, […]-280.
Keir, Gertrude. “Some Sex Differences in Attitude Towards […]
of Environment Among Evacuated Central […] Children.”
British Journal of Educational Psychology, XV (1946), […].
Lindzey, Gardner E. “Four Psychometric Techniques Useful in
Vocational Guidance.” Journal of Clinical Psychology, II
(1946), 157-160.


Malamud, Daniel I. “Value of the Maller Controlled Association
Test as a Screening Device.” Journal of Psychology, XXI
(1946), 37-43.
Malamud, R. F. and Malamud, D. I. “The Multiple Choice
Rorschach: A Critical Examination of Its Scoring System.”
Journal of Psychology, XXI (1946), 237-242.
McNamara, W. J. and Weitzman, E. “The Economy of Item
Analysis with the I B M Graphic Item Counter.” Journal of
Applied Psychology, XXX (1946), 84-90.
Miles, D. W., Wilkins, W. L., Lester, D. W. and Hutchens, W. H.
“The Efficiency of a High-Speed Screening Procedure in Detecting
the Neuropsychiatrically Unfit at a U. S. Marine Corps
Recruit Training Depot.” Journal of Psychology, XXI (1946),
[…].
Patrick, Catharine. “Different Responses Produced by Good and
Poor Art.” Journal of General Psychology, XXXIV (1946),
[…].
Petch, J. A. “A Comparison of the Orders of Merit of H. S. C.
Candidates Offering Two Modern Languages.” British Journal
of Educational Psychology, XV (1946).
Rashkis, H. A. and Shaskan, D. A. “The Effects of Group
Psychotherapy on Personality Inventory Scores.” American Journal
of Orthopsychiatry, XVI (1946), 345-349.
Rashkis, H., Cushman, J. F. and Landis, C. “A New Method for
Studying Disorders of Conceptual Thinking.” Journal of
Abnormal and Social Psychology, XLI (1946), 70-74.
Rosenzweig, S., Clarke, H. J., Garfield, M. S. and Lehndorff, A.
“Scoring Samples for the Rosenzweig Picture-Frustration
Study.” Journal of Psychology, XXI (1946), 45-72.
Smith, G. H. “Attitudes Toward Soviet Russia: I. The
Standardization of a Scale and Some Distributions of Scores.” Journal
of Social Psychology, XXIII (1946), 3-16.
Springer, N. N. “A Short Form of the Wechsler-Bellevue
Intelligence Test as Applied to Naval Personnel.” American Journal
of Orthopsychiatry, XVI (1946), 341-344.
Strong, E. K., Jr. “Interests of Senior and Junior Public
Administrators.” Journal of Applied Psychology, XXX (1946), 55-71.
Thurstone, L. L. “The Prediction of Choice.” Psychometrika, X
(1945), 237-253.
Waehner, Trude S. “Interpretation of Spontaneous Drawings and
Paintings.” Genetic Psychology Monographs, XXXIII (1946),
[…].
Welch, L., Diethelm, O. and Long, L. “Measurement of
Hyper-Associative Activity During Elation.” Journal of Psychology,
XXI (1946), 113-126.
Welch, L. and Long, L. “Psychopathological Defects in Inductive
Reasoning.” Journal of Psychology, XXI (1946), 201-226.
Werner, Heinz. “Abnormal and Subnormal Rigidity.” Journal of
Abnormal and Social Psychology, XLI (1946), 3-24.


Wright, M. E. “Use of the Shipley-Hartford Test in Evaluating In- 
tellectual Functioning of Neuropsychiatric Patients.” Journal 
of Applied Psychology, XXX (1946), 45-50. 

Yacorzynski, G. K. and Newmann, C. A. “A Quantitative
Approach to the Study of Responses of Psychotics in the
Completion of Figures Involving Visual and Motor Components.”
Journal of General Psychology, XXXIV (1946), 19-27.


THE CONTRIBUTORS 


Dorothy C. Adkins—Ph.D., Ohio State University, 1937. 
Graduate Assistant in Psychology, Ohio State University, 1931-1932.
Assistant in Psychology, Ohio State University, 1932-1936. Assistant
Examiner, Board of Examinations, University of Chicago, 1938-[…].
Assistant Chief, 1940, and Chief, Research and Test Construction
Section, State Technical Advisory Service, Social Security
Board, 1940-1944. Chief, Social Sciences and Administration, Test
Development Unit, United States Civil Service Commission, 1945-.
Author of articles on test construction and statistical methods applied
to test results. Associate Member, American Psychological
Association. Member, Psychometric Society. Assistant Managing Editor
of Psychometrika, 1938-. Associate Editor of EDUCATIONAL AND
PSYCHOLOGICAL MEASUREMENT, 1940-.

Kenneth L. Bean—Ph.D., University of Michigan, 1938. Assistant,
Department of Psychology, University of Michigan, 1936-1937.
Psychological Interne, Guidance Center, New Orleans, Louisiana,
1939-1940. Instructor of Psychology, Marshall College, Huntington,
West Virginia, 1940-1941. Instructor of Psychology, Bethany
College, Bethany, West Virginia, 1942. Personnel Technician,
Examining Division, Louisiana Department of State Civil Service,
1942. Author of articles on clinical psychology and the psychology
[…]. Associate Member, American Psychological Association.


Ray H. Bixler—M.A., Ohio State University, 1942. Psychologist,
Akron Child Guidance Center, 1943-1944. Counselor, 1944, and
Senior Counselor, Student Counseling Bureau, University of
Minnesota, 1945-. Author of articles in the Journal of Clinical
Psychology and the Journal of Consulting Psychology. Member,
American Psychological Association.


Joseph […]—[…], University of Minnesota, 1939. Chief, Personnel
[…] Unit, […] Air Technical Service Command, 1942-. Employed by
Examining Division of the Los Angeles […] Civil Service Commission
and the Los Angeles Board of Education, […].


Edward S. Bordin—Ph.D., Ohio State University, 1942. Special
Research Assistant, Ohio State University, 1938-1939. Assistant to
the Coordinator of Student Personnel Services, University of
Minnesota, 1939-1940. Assistant to the Director of Student Counseling
Bureau (then called the University Testing Bureau), University of
Minnesota, 1940-1941. Counselor, Student Counseling Bureau,
University of Minnesota, 1941-1942. Personnel Technician, Personnel
Research Section, AGO, War Department, 1942-1945. Senior
Counselor and Assistant Professor of Psychology, Student Counseling
Bureau, University of Minnesota, 1945. Acting Director of Student
Counseling Bureau, University of Minnesota, 1945-. Author of
articles on statistical and experimental methodology, research in
counseling and test theory and analysis. Associate Member, American
Psychological Association. Member, Psychometric Society,
American Society for Aesthetics.


Wilbur S. Gregory—Ph.D., Syracuse University, 1937. Special 
Advisor to Freshmen and Instructor of Psychology, University of
Nebraska, 1937-1940. Guidance Consultant and Assistant Professor of
Psychology, University of Nebraska, 1940-1942. Service in
the U. S. Army Air Forces, 1942-1946. Guidance Consultant and
Assistant Professor of Psychology, University of Nebraska, 1946-.
Author of articles in the fields of social and clinical psychology and
guidance. Member, American Psychological Association, American
Association for the Advancement of Science, American College
Personnel Association.


Thomas Willard Harrell—Ph.D., Johns Hopkins University,
1936. Instructor of Psychology, 1936-1939; Assistant Professor,
1939-1945 (on leave 1940-1945); Associate Professor, 1945-, University
of Illinois. Personnel research in cotton textile ...
Callaway Mills, summer of 1935, Columbus Plant of Bibb Manufacturing
Company, summer of 1936, Georgia Engineering Experiment
Station, summer of 1937. Research Consultant to ...
Williams and Cleary, summers of 1939 and 1940. Engaged in personnel
research and personnel administration, Army ...
1945. Author of articles in EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT,
Psychological Bulletin, and other journals. Member,
American Psychological Association, Psychometric Society, Illinois
Association for Applied Psychology. Fellow, American Association
for the Advancement of Science.


H. M. Hildreth—Ph.D., Syracuse University, 1935. Clinical
work, 1930-1935. Instructor of Psychology, 1936-1938; Assistant
Professor, 1938-1940; Associate Professor, 1940-1942, Syracuse
University. United States Naval Reserve, active duty 1942-. Member,
American Association for the Advancement of Science, American
Orthopsychiatric Association, American Association of University
Professors, Sigma Xi. Fellow, American Psychological Association.


D. Welty Lefever—Ph.D., University of Southern California,
1927. Member of the Faculty of the University of Southern California
since 1926. At present, Professor of Education. Consultant
to the Personnel Testing Unit, San Bernardino Air Technical Service
Command. Author of Predicative Values of Certain Groupings of
the Test Elements of the Thorndike Intelligence Examinations.
Co-author of Principles and Techniques of Guidance. Member, Phi
Kappa Phi, Phi Delta Kappa.


Milton M. Mandell—B.A., New York University, 1933. Assistant
Director of Examinations, Los Angeles City Civil Service Commission,
1939-1940. Classification Consultant, State of Connecticut,
1940-1941. Regional Personnel Officer, OEM, 1941-1942. Personnel
Officer, Office of Program Vice-Chairman, War Production
Board, 1942-1943. Chief Analyst, Committee for Congested Areas,
1943-1944. Chief, Administrative and Management Testing, U. S.
Civil Service Commission, 1944-. Member, American Society of
Public Administration, Civil Service Assembly.


Charles I. Mosier—Ph.D., University of Chicago, 1937. Instructor
of Psychology and Vocational Guidance Counselor, University of
Florida, 1933-1936. Assistant Professor of Psychology, University
of Florida, 1937-1939. Acting University Examiner, University of
..., 1938. Assistant Examiner, Sloan Research Project, 1940-...
Personnel Research Technician, State Technical Advisory
Service, Social Security Board, 1941; Chief of Position Classification,
...; Chief of Personnel Methods and Standards, 1943-1944; Chief
of Research and Test Construction, 1945-. Author of articles in
Psychometrika, Psychological Review, Journal of Educational Psychology,
and other journals. Associate Member, American Psychological
Association. Member, Psychometric Society, Southern Regional
Committee of the Social Science Research Council. Member
of the editorial boards of Psychometrika and EDUCATIONAL AND
PSYCHOLOGICAL MEASUREMENT.


Anne Roe—..., Columbia University, 1932. ... Research ...,
1931-1933. Assistant Psychologist, Worcester State Hospital,
1933-1934. Director, Survey of Alcohol Education, 1941-1943;
Statistical Consultant, Foster Child ..., 1941-1943. Psychologist,
Section on Alcohol Studies, Laboratory of Applied Physiology,
Yale University, 1943-1946. Co-author of Adult Intelligence,
Quantitative Zoology, Intelligence in Mental Disorder, Adult
Adjustment of Foster Children. Author of Survey of Alcohol
Education in Elementary and High Schools in the United States,
Alcohol and Creative Work, and articles. Member, ... Psychological
Association, Metropolitan New York Association of Applied
Psychologists, National Council of ..., American Society for
Research in ..., Rorschach Institute, Society of Vertebrate
Paleontology. Fellow, American Psychological Association, New
York Academy of Sciences.


NEW STANDARDS FOR TEST EVALUATION¹


J. P. GUILFORD 
University of Southern California 

It is common tradition that no psychological test should be
utilized unless it possesses a high degree of reliability and at 
least a moderate degree of validity. Reliability and validity, 
however derived operationally, have been the two standard 
criteria of the worth of a test. It is not the purpose of this 
discussion to propose that the general practice of evaluating 
tests be discarded, but rather to suggest some drastic revisions 
in its applications and to propose some additional criteria of
the goodness of a test, criteria which may become even more 
important than reliability and validity as we have known them. 
The textbooks very commonly set forth the rule that no test 
should be used to discriminate among individuals unless its 
reliability is as high as .90 (some say .94 and some say .96). 
There also seems to be common tradition that a test (or battery
of tests yielding a single composite score) is of little practical 
use in making predictions unless the correlation of scores with 
some criterion of success or of adjustment is as high as .45.
These standards need serious re-examination.

There are other conceptions of reliability and validity that
will bear careful inspection, in view of recent experiences of the
writer and others who took part in the Army Air Forces psychological
program. Concerning reliability, there seem to be common
opinions (1) that each test has an absolute reliability
that is characteristic of it; (2) that high reliability
is a desirable goal in and of itself; (3) that a test cannot be
valid unless it has a substantial degree of reliability; and (4)
that by increasing the reliability of a test we automatically
increase its validity.

¹ Based upon a paper read before the Western Psychological Association meeting
at Stanford University, June 29, 1946.


Concerning validity, there seem to be common opinions that 
(1) validities of .50 to .60 are the practical upper limits of 
correlation between test scores and criteria of success; (2) 
validities of .10 and .20 are so inconsequential that tests with 
such small predictive values are not worth using, even in test 
batteries; (3) each test in a battery should have a maximum 
correlation with the practical criterion; (4) after combining 
four or five tests in a battery, the validity of the composite can- 
not be materially increased by adding more tests; (5) there 
would be no question concerning the utility of tests with validi- 
ties of .60 to .80; and (6) tests are valid if by inspection they 
obviously look valid. 
All of these conceptions and conclusions will be briefly called 
into question. Before proceeding, however, the terms “relia- 
bility” and “validity” require better definition. Statistically 
defined, reliability is the proportion of non-error variance in the 
total-test scores. From this point on, there is often disagree- 
ment as to which contributions to total variance should be con- 
sidered as error variance and which should not. The various 
operations by which reliability is estimated—internal consis- 
tency, alternate forms, and test-retest—rest upon different 
assumptions on this question. In the following discussion an 
internal-consistency reliability (estimated from odd-even cor- 
relation, Kuder-Richardson method, and the like) will be meant 
unless otherwise specified. Even under this restriction, an estimated
reliability coefficient will vary from one population to
another, and will depend upon other factors, including the testing
conditions and the scoring formula. Validity, in my opinion,
is of two kinds: factorial and practical. The factorial validity 
of a test is given by its loadings in meaningful, common, refer- 
ence factors. This is the kind of validity that is really meant 
when the question is asked “Does this test measure what it is
supposed to measure?" A more pertinent question should be 
“What does this test measure?” The answer then should be 
in terms of factors and their loadings. The practical validity 
of a test is given by its correlation with a practical criterion of
adjustment, vocational or personal. In the following discussion,
practical validity is meant unless otherwise specified. In
a very general sense, a test is valid for anything with which it
correlates.

Before examining the prevailing conceptions point by point, 
one or two general statements should be made. In the evalu- 
ation of tests for practical use, practical considerations should 
be permitted to enter the picture, and realistic conceptions 
should prevail. Tests are generally used in selection and classi- 
fication of personnel, and in vocational and personal guidance 
of individuals. In selection and classification we are usually 
concerned with composite scores; in clinical testing we are frequently
concerned with single test scores as well. Judging the
worth of a test will differ somewhat according to whether it
provides a separate evaluation of individuals or whether it
serves as a member of a team. This difference is not always
recognized. Lower reliabilities and validities can be tolerated
in tests used in combination with others than when tests are
used separately. It is commonly recognized that a composite
score almost always has greater validity than any of the single
scores that enter into it. It is not so often realized that a composite
score will also be more reliable than part scores, if there
is intercorrelation among the part scores, as there usually is.

The comments that follow will be more intelligible if viewed 
on the background of factor theory. It is one of the definite
convictions of the writer that factorial conceptions of tests give
us the most illuminating and useful basis for drawing conclusions
regarding the issues involved in test practice. This conviction
goes so far as to maintain that the most meaningful,
economical, and controllable type of test battery is one that is
composed of factorially pure or unique tests. If these general
principles are accepted, most of the issues under discussion are
automatically decided. The reader need not accept these principles
in order to agree with some of the conclusions that follow.
Acceptance of those conclusions, however, will take one a long
way toward agreement with the principles.

Need tests achieve a reliability of .90 or higher for useful 
individual measurement? This rule can be traced back to
Kelley's² mathematical rationale of the measurement problem.

² Kelley, T. L. Interpretation of Educational Measurements. New York: World
Book Company, 1927. P. 210 ff.


There is no disputing his conclusion or the rule, if one accepts 
his premises. They have to do with the accuracy of measure- 
ment. I believe, however, that his premises, and hence the rule 
that follows from them, are quite unrealistic from the practical 
point of view. If the rule were to be rigorously followed, the 
greater part of present testing would have to be abandoned. 
It is admittedly important that the test user be aware of the 
margin of error in obtained scores (although the issue goes 
much deeper than that, as I hope subsequent discussion will 
show). But awareness of the margin of error is quite a different
thing from rejecting tests entirely because they do not meet
some arbitrary degree of accuracy. I venture to say that reliabilities
are characteristically below .90 rather than above .90,
as ordinarily estimated. An inspection of a sample of 74 of the 
Army Air Forces tests designed for selection and classification 
purposes showed that the median reliability was .80 and the
range was from .10 to .97. Not all of these tests were by any
means put into use, but many a test whose reliability was below
.80 was useful in a battery or could be useful. Three rather
dramatic instances might be mentioned. One test on judgments
of lengths of lines, a very short test, had a reliability of
.25 and a validity for pilot selection of .23. This validity represented
an almost unique contribution. A biographical-data
test, scored for navigator selection, had a reliability of .35 and
a validity of .23, much of which was a unique contribution.
A 15-item test of practical judgment had a reliability of .36 and
a validity for pilot selection of .36. All of these statistics were
based upon large samples and so are rather stable. In order to
achieve a reliability of .94, according to the Spearman-Brown
principle, the judgment test would have to be lengthened to
include about 400 items and would require about seven hours of
testing time.
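
The Spearman-Brown figure in the last sentence can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are labels chosen here, and the formula is the usual Spearman-Brown prophecy formula, assumed to be the "Spearman-Brown principle" meant above. The inputs (15 items, reliability .36, target .94) are the figures quoted in the text.

```python
def lengthened_reliability(r_current: float, n: float) -> float:
    """Spearman-Brown: reliability of a test lengthened n-fold with comparable items."""
    return (n * r_current) / (1.0 + (n - 1.0) * r_current)


def lengthening_factor(r_current: float, r_target: float) -> float:
    """The same formula solved for n, the factor by which the test must grow."""
    return (r_target * (1.0 - r_current)) / (r_current * (1.0 - r_target))


# Figures quoted in the text for the 15-item practical-judgment test.
n = lengthening_factor(0.36, 0.94)
items_needed = 15 * n

print(f"lengthening factor: {n:.1f}")
print(f"items needed: {items_needed:.0f}")
print(f"check, reliability at that length: {lengthened_reliability(0.36, n):.2f}")
```

With these inputs the lengthening factor comes to a little under 28, that is, on the order of 400 items, which agrees with the estimate given above.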

It might be pointed out in this connection that a test can
actually be valid and yet have zero internal-consistency reliability.
There are certain types of tests, such as biographical
data and general information, quite heterogeneous in content
functionally, of which this could be true. If one selected items
by validating each separately against a job criterion and at the
same time by seeking items with minimal intercorrelation, this
extreme condition would be approached. Although the internal
consistency would be very low, the test-retest reliability
would, of course, be higher.

Need tests have a validity greater than .45 to be practically 
useful? This rule stems from the use of the index of forecasting 
efficiency, which equals about 10 per cent when r equals .45.³
Statistically, this rule is incontestable, provided one accepts the
arbitrary limit of 10 per cent efficiency so defined. When the
approach is in terms of practical costs and utilities, however, the
standards look very different. The criterion proposed in recent
years by Taylor and Russell,⁴ which is based upon the success
ratio with and without the benefit of testing, and the criterion
proposed by Richardson,⁵ which is based upon a proficiency
ratio, are not only more realistic but also preclude the use of
any fixed minimum coefficient of validity for the purpose of
accepting or rejecting tests. Under certain favorable conditions
of selection, validities as low as .20 and even .10 may prove to
be of practical utility.⁶ Under unfavorable conditions of selection,
validities as high as .60 and even .70 may indicate little
value of a test in selection. Two favorable conditions for selection
are (1) a job situation in which without the use of tests
most applicants would fail, and (2) a labor market such that
many applicants can be rejected. The converse situations are
generally unfavorable for effective selection by means of tests.
In the clinical use of tests, other kinds of standards are needed.


³ Hull, C. L. Aptitude Testing. New York: World Book Company, 1928.
Chap. ...

⁴ Taylor, H. C. and Russell, J. T. “The Relationship of Validity Coefficients
to the Practical Effectiveness of Tests in Selection.” Journal of Applied Psychology,
XXIII (1939), 565-578.

⁵ Richardson, M. W. “The Interpretation of a Test Validity Coefficient in
Terms of Increased Efficiency of a Selected Group of Personnel.” Psychometrika,
IX (1944), 245-248.

⁶ A recent instance of ill-informed application of validity standards
appears in an article by Albert Ellis, “The Validity of Personality Questionnaires,” in
the Psychological Bulletin, XLIII (1946), 385-440. In this instance, the author ...
reports “conventional estimations” as follows: ... from .00 ...
as “...”; .40 through .69 as “questionably positive,”
... and above ... What is worse, his general conclusions will
be misleading to the reader who finds the statement that of ...
..., 15 gave positive, 6 questionably positive,
and 13 negative results, and who fails to note the short paragraph in which
the unusual statistical standards are announced.



It is well for the clinician to keep in mind the standard error of 
estimate of a test, if he has one at his disposal. But even then 
it is doubtful whether any fixed minimum standard should be 
universally applied. 
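
The .45 rule discussed above can be reproduced numerically. The sketch below assumes the usual definition of the index of forecasting efficiency, E = 1 - sqrt(1 - r^2); the text does not spell the formula out, and the function name is a label chosen here for illustration.

```python
import math


def forecasting_efficiency(r: float) -> float:
    """Index of forecasting efficiency, assumed here to be 1 - sqrt(1 - r^2)."""
    return 1.0 - math.sqrt(1.0 - r * r)


# Validity coefficients mentioned at various points in the discussion.
for r in (0.10, 0.20, 0.45, 0.60, 0.80):
    print(f"r = {r:.2f}  efficiency = {100 * forecasting_efficiency(r):.1f} per cent")
```

On this index a validity of .45 gives a little over 10 per cent, while validities of .10 and .20 give well under 5 per cent. That is why a fixed cutoff on the index condemns exactly the low-validity tests that the success-ratio and proficiency-ratio criteria can show to be useful.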

It is sometimes said that if the validity of a test is high, 
we need not be concerned about its reliability. To that point 
of view the writer heartily subscribes. Relatively too much 
attention has been given to reliability and too little to validity.
This is partly because the factorial validity (and too often, also, 
the practical validity) of a test has been taken too much for 
granted. High reliability should never be regarded as a desirable
goal in and of itself. It is important only insofar as it contributes
to validity. Contrary to what the textbooks lead one
to believe, increasing the reliability of a test will not necessarily 
add to its validity. Validity will increase only when improved 
reliability means an increase in variance contributed by factors 
that the test has in common with the criterion. 

Let us assume that a test measures simultaneously two common
factors, reasoning and number ability (plus other common
factors that we can ignore at the moment). This is true
of most arithmetic reasoning tests. Let us assume, further, that
the reasoning-factor variance is also a component of the job
requirements of a supervisor of clerks, but that the number-ability
variance is of no importance. Suppose that in an attempt
to make the test more reliable, an examiner alters it in
such a way as to increase the number-factor variance in the
test, leaving the reasoning-factor variance unchanged. The
test thus becomes more reliable but no more valid than before
for the selection of clerical supervisors. On the other hand, an
examiner who knew the factor composition of the test and of
the criterion, would attempt systematically to reduce the number
variance and to increase the reasoning variance. The result
might be that the reliability is unchanged but the validity
would be increased.
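
The arithmetic-reasoning example can be made concrete with a small variance budget. Everything numeric below is hypothetical and chosen only for illustration: the variance figures, the criterion loading of .50 on reasoning, and the function names. The sketch also assumes uncorrelated factors, a criterion that loads on reasoning only, and reliability taken as the non-error share of total variance.

```python
import math


def reliability(v_reasoning: float, v_number: float, v_error: float) -> float:
    """Proportion of non-error variance in the total test variance."""
    common = v_reasoning + v_number
    return common / (common + v_error)


def validity(v_reasoning: float, v_number: float, v_error: float,
             criterion_loading: float) -> float:
    """Test-criterion correlation when only the reasoning factor is shared."""
    total = v_reasoning + v_number + v_error
    return criterion_loading * math.sqrt(v_reasoning / total)


C = 0.50  # hypothetical loading of the supervisor criterion on reasoning

base = (36.0, 25.0, 39.0)          # reasoning, number, error variance
more_number = (36.0, 45.0, 39.0)   # number variance raised, reasoning untouched
more_reasoning = (49.0, 12.0, 39.0)  # variance shifted from number to reasoning

for label, v in (("original", base),
                 ("more number variance", more_number),
                 ("more reasoning variance", more_reasoning)):
    print(f"{label:24s} reliability {reliability(*v):.2f}  validity {validity(*v, C):.2f}")
```

Under these assumptions the first alteration raises reliability without raising validity, and the second leaves reliability unchanged while raising validity, which is the contrast the paragraph draws.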

It is the amount of variance in valid factors in a test that
counts. The invalid common-factor variance, though contributing
to reliability, might just as well be error variance. There
are even grounds for arguing that it would be better if the
invalid variance were given over to error variance. Invalid
common-factor variance biases selection in a certain direction,
whereas error variance does not. This becomes serious in case
the invalid variance has a negative correlation with the criterion 
and yet the test is weighted positively for selection. One or two 
instances of this kind were encountered in Army Air Forces 
testing, e.g., a reading-comprehension test whose mechanical 
factor had positive validity but whose verbal factor had nega- 
tive validity for pilot selection. Whenever the invalid variance 
is non-error and ordinarily provides selection for “good” quali- 
ties, however, it can be argued that little or no harm is done by 
leaving this variance in a test. But even so, its contribution 
to reliability has little meaning in this particular application 
of the test. 

Validities in general, not just a sprinkling of them, can be 
materially higher than .50 to .60. The pessimism that has sur- 
rounded most test development in this respect in the past has 
been due to an unwarranted, restricted outlook. The finding 
that tests beyond the fourth or fifth in a battery add very little 
to validity has been due to the fact that the test maker has 
remained within a circumscribed area of human aptitude. The 
overemphasis given to the concept of general intelligence and
to the IQ is to a large extent responsible for this. The dependence
upon direct observation in job analysis is another determiner
in this stalemate. Halo effects are present in evaluating
jobs by inspection as well as in evaluating people. The consequence
is that more and more of the same kinds of tests are
constructed. The theories behind them may differ and they
may look different from already constructed tests, but functionally
they remain within the small circle of better known
abilities. It is my conviction that only by an objective, empirical
procedure such as factor analysis can we know what abilities
and traits are represented in either tests or jobs. It requires
such an approach to enable us to break the shackles of tradition
and to realize the great richness of human variability that
actually exists. The most promising way of increasing a multiple
correlation is to add to a battery tests with unique valid
variance. Another way is to increase the saturations of tests
already in a battery with valid factor variances. One can 
hardly accomplish either of these improvements without know- 
ing what the factors are. 

As a concrete example of the foregoing points, I can cite the 
selection of pilot trainees in the Army Air Forces. It was found 
that the 21 scores offered by the classification battery measured 
only eight of the factors that appear to be positively loaded in 
the pilot-training criterion.⁷ All of these factors, incidentally,
are foreign to the usual intelligence test. The use of intelligence 
tests for the selection of pilots among those whose IQ's are 
above 100 would be practically futile. From the estimated 
factor loadings of these eight factors in the pilot criterion, it
could be predicted that those factors, optimally weighted in the 
test composite, would yield a validity of about .60 for that 
composite. This was not far from the validity actually obtained.
From results with experimental tests, it was estimated
that there were nine other factors having positive loadings in
the pilot criterion. Had the classification battery included
them, properly weighted, the validity of the composite should
have been about .70. There were four other factors in which
the pilot criterion appeared to have very low negative loadings.
With these factors also included and appropriately weighted,
the multiple correlation should be about .72. At least two
unknown factors that appeared to have substantial pilot validity
were not included in these considerations. New factors were
still undisclosed but indicated before the end of the war. With
one or two exceptions, the 21 factors with some claim to recognition
in the pilot criterion would ordinarily be called abilities.
Whatever variances were contributed to the criterion by temperamental
factors were almost untouched. The conclusion
should be that the upper limit of validity for any battery is an
unknown quantity. Any estimate of it needs to be liberal and
subject to revision as new factors come into the picture. Incidentally,
the number of human factors, when they are much
better known, will probably run much larger than has been supposed.
The horizon of aptitudes is slowly but surely extending
beyond the confines of the IQ. It is hoped that the horizon of
temperament will also grow beyond the concepts of neurotic
tendency and the PQ.

⁷ A complete account of the findings upon which these statements are based will
be published soon by the Army Air Forces ... Printed Classification
Tests, of which the writer is editor.

A validity such as the one just mentioned for pilot selection 
should also be interpreted in the light of the range of aptitude 
within which selection was made and of the reliability of the 
criterion. The obtained validity of .60 pertains to the range of 
talent among applicants who had previously been screened on 
the Army Air Forces Qualifying Examination. Later evidence 
pointed to a validity figure of .66 for the same composite aptitude
score when the range was extended to those who would
have failed to pass the Qualifying Examination. The reliability
of the pilot-training criterion was never satisfactorily
estimated, but was probably between .70 and .80. If we are
conservative in making a correction for attenuation due to a
fallible criterion and assume that the reliability was .80, the
corrected validity becomes .73. What was the validity of the
composite pilot-aptitude score? There are as many answers to
this question as there were sets of conditions under which the 
composite was derived, applied, and validated. 
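
The correction for attenuation used in this paragraph can be sketched as follows. The formula is the standard correction for an unreliable criterion only (observed validity divided by the square root of the criterion reliability), assumed here to be the one the text applies; the function name is a label chosen for illustration, and the inputs are the rounded figures quoted above.

```python
import math


def corrected_for_criterion_unreliability(validity: float,
                                          criterion_reliability: float) -> float:
    """Validity corrected for attenuation due to a fallible criterion only."""
    return validity / math.sqrt(criterion_reliability)


# .66 is the validity for the extended range; .70 and .80 bracket the
# criterion reliability suggested in the text.
for r_cc in (0.80, 0.70):
    corrected = corrected_for_criterion_unreliability(0.66, r_cc)
    print(f"criterion reliability {r_cc:.2f}  corrected validity {corrected:.3f}")
```

With the rounded inputs and a criterion reliability of .80 the result is just under .74, close to the .73 reported; the small difference presumably reflects rounding in the published figures. Taking the lower reliability of .70 gives a larger corrected value, which is why .80 is the conservative assumption.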

One aim, in the construction of a test battery, has usually
been to maximize the validity of each separate test. The multiple-regression
principles have fostered this objective. Mathematically
there is nothing wrong with it. From another point
of view, however, the practice is unfortunate in that it works
toward factorially complex tests. It is far better, in my opinion,
to seek a battery of maximally independent, factorially pure
tests, each with a unique contribution to make. Complex tests
give ambiguous scores and are duplicative and wasteful in general
use. Pure tests are unambiguous in what they measure,
they are much more manageable when used in combination with
others, and they cover a large range of traits economically.
Job criteria are highly complex factorially, but each is
characterized by a pattern of factorial requirements. The best
differential predictions, as in classification of personnel or in
vocational guidance, are to be achieved when pure tests are
used.


Now factorially pure tests, when taken alone, are likely to
be less valid than complex tests. This fact works against them
in test construction, unless the examiner pays more attention
to a second principle of multiple regression: that the intercorrelations
shall be as low as possible. Most test constructors
have paid more attention to the first principle, maximal validity
for each test, at the expense of the second. It is not sufficiently
realized that these two principles work in opposition
and that satisfying them both requires exacting procedures.
A good route to independent tests is definitely through factor
analysis, by which it is possible to recognize the unique contributions
of tests so that one may make the most of them. In a
battery, the overall validity of the composite score can be just
as great with pure tests as with complex tests. The best way
to satisfy the aims of the multiple-regression principles is to
maximize the purity of each test and to maximize the saturation
in its one common factor. This should be accompanied by a
factorial study of the job criterion in order to determine what
factors must be covered and how important each one is. Frequently
it will be found that a complex test with high validity
adds nothing to a battery while a pure test with much lower
validity may do so. When we ask what is each test's unique
contribution to a battery rather than what is its total contribution,
we are comparing tests on a much more equitable basis.
Its validity coefficient, as such, loses much of its importance.
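
The opposition between the two multiple-regression principles shows up even in a two-test battery. The sketch below uses the standard formula for the multiple correlation of a criterion with two predictors; the validities and intercorrelations are hypothetical values chosen for illustration, not data from the Army Air Forces program, and the function name is a label chosen here.

```python
import math


def multiple_r_two_tests(r1: float, r2: float, r12: float) -> float:
    """Multiple correlation of a criterion with the best-weighted sum of two tests.

    r1 and r2 are the validities of the tests; r12 is their intercorrelation
    and must satisfy |r12| < 1.
    """
    r_squared = (r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 * r12)
    return math.sqrt(r_squared)


# Two complex tests: higher single validities, but heavily overlapping.
complex_pair = multiple_r_two_tests(0.50, 0.50, 0.80)

# Two pure tests: lower single validities, but independent of each other.
pure_pair = multiple_r_two_tests(0.40, 0.40, 0.00)

print(f"complex tests (.50, .50, r12 = .80): R = {complex_pair:.3f}")
print(f"pure tests    (.40, .40, r12 = .00): R = {pure_pair:.3f}")
```

In this hypothetical case the second complex test adds very little to the first, while the pair of independent tests, each less valid alone, yields the higher composite validity.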

A final word on validity concerns validation by inspection.
When validation data are lacking, the construction or the adaptation
of a test or a battery to some new use often must proceed
on the basis of considerable guesswork, call it “crystal ball” or
professional judgment. A natural and relatively safe approach
is to devise a “job-sample” test, one that mimics fairly closely
the central task of a job or some crucial constituent part of the
job. Such tests have a fair probability of being valid for that
particular job. One example of this is the Complex Coordination
Test that was developed between wars for pilot selection.
In this test the examinee has to make one set of adjustments
after another with an imitation pilot’s stick and rudder control
in response to changing signals. It looks like a valid test to
sophisticated and unsophisticated alike, and it does prove to
have considerable selective value for pilot trainees. Yet, this
test has proved to be almost as valid for the selection of aircraft
mechanics and radio-operator mechanics; it had substantial
validity for the selection of navigators, bombardiers, and flexible
gunners; and it correlated substantially with scores in pistol
firing and carbine firing. Furthermore, it had moderate correlations
with a few paper-and-pencil tests that have no superficial
resemblance to it.

Even sophisticated judgment often goes astray on decisions
as to what a test measures. A test designed to measure commonsense
judgment when factor analyzed turns out to be a test
of mechanical experience. A test designed as a reasoning test
is found to be one of numerical facility, when analyzed. A test
of pilot interest proves to have some variance, indeed, in that
factor, but it is stronger in variance for the verbal factor. A
test designed to test the ability to maintain orientation in space
turns out to be primarily a measure of perceptual speed. This
list could be extended. The moral of it is that in test construction
and in job analysis, things are not always what they seem.

This is partly because our categories of aptitudes and traits
are faulty. Empirically determined factors, on the other
hand, when sufficiently well defined, seem to be stable and
...; they are amenable to direct observation once
they have been brought to light. This discussion does not
necessarily argue against the use of "face validity" in tests.
Face validity makes tests more palatable to the public. But
face validity may have nothing whatever to do with actual
validity, and it should be remembered that the problem of
actual validity is never solved just because a test has face
validity.

In what has preceded, I have attempted to show, without
more than the minimum of proof, that in the practical
use of tests there can be no absolute standards for either reliability
or validity. In this connection one must be a confirmed
relativist. A great many considerations must be noted, many
of which have a bearing upon each situation. Of the two,
validity is the more important. Much more important than
either is the factorial composition of the test. I predict a time
when any test author will be expected to present information
regarding the factor composition of his tests. Along with all
descriptive statistics, whether of reliability, validity, or of factorial
composition, there should be given more information than
at present concerning the kind of samples and the populations
from which they were drawn, and concerning other conditions
affecting these statistics. Information concerning the validity
of a test should be accompanied by details regarding the nature
of the practical criterion.




CLIENT-CENTERED COUNSELING 
C. GILBERT WRENN 


University of Minnesota 


The contribution made by Rogers in his published statements
regarding non-directive counseling has been very considerable.
The emphasis has been laid upon what actually
happens to the client as opposed to the counselor's conclusions
concerning him. There is little doubt that this is a needed
emphasis and, although not a new concept, a contribution to
effective practice. Rogers writes persuasively and it is only
upon careful appraisal that one becomes aware of certain inconsistencies
in his concepts. All proponents of new ideas or
emphases are liable to the error of overenthusiasm in their
approach and to a belief that the new concept or method will
provide a much needed panacea. This enthusiasm coupled with
persuasive writing has made Rogers' publications particularly
difficult to evaluate (3, 4).

The first assumption that seems to be in error is that
client-centered counseling and non-directive counseling are
synonymous. Client-centered counseling has been used in varying
degrees of emphasis by counselors for generations. Rogers has
carried this concept to its ultimate extreme and has termed it
“non-directive.” He has systematized the approach at this
extreme level and has provided an excellent discussion of
procedures to be used and cautions to be observed. He believes
that “directive” counseling is guilty of grave error in that
the counselor assumes responsibility for the conclusions
reached. For when the student’s mental processes are not the
center of attention, two errors are apt to be in evidence: (1)
there is lack of awareness of the extent to which the diagnosis
and conclusions of the counselor are accepted and (2) there is
a passing over of repressed but possibly more fundamental
difficulties in the emotional and rational life of the client.
His treatment, however, has been almost a Philippic against
what he terms “directive” counseling. In charging directive
counseling with neglect of the client and in proposing
the advantages of non-directive counseling, he has presented
counseling as falling into a dichotomy: one category, the
directive, possessing a complete absence of client-centeredness;
the other, the non-directive, having a completely
client-centered approach.

It has seemed useful to investigate the possibility that
client-centeredness in counseling falls along a continuum of
emphasis. It is true that certain counselors may invariably
use an extreme of client-centeredness or counselor-centeredness
while other counselors may do so only under certain conditions.
A great deal of counseling, however, falls at other points than
at the extremes of the continuum suggested. The question
arises as to the criteria that might be established to determine
the extent of client-centeredness to be used in a given situation.
Whatever criteria could be suggested are subject to misuse if
adopted literally. On the other hand, without such criteria
the counselor fumbles in his attempts early in the interview to
adopt the best counseling procedures for a given situation. Two
possible sets of criteria might be suggested:

A. Criteria revolving around the nature of each client,
varying from one counseling situation to another.

1. The hypothesis regarding the client need, or to use a
term recently coined by Bordin (2), the “diagnostic
construct,” which is set up early in the counseling
process. Such a construct as “self-conflict” or
“choice-anxiety” clearly calls for a high degree of
non-directiveness while other needs might require an
information emphasis.

2. The degree of emotional tension in the client.

3. The apparent maturity of the client, his ability to ...

4. The apparent urgency of the problem. A problem
which is so urgent that only one interview can be
held before a decision is reached by the client may
demand a high degree of counselor participation.

5. The extent to which specific information is needed.
This information may or may not be closely related
to the basic problem of the individual but the need
for information must be met.

6. The apparent degree of dependency of the client.
This is included as one of Bordin’s “constructs” but
dependency may also be a factor in what is an even
more basic personality need. An attitude of dependency
in a client certainly calls for great carefulness
in counselor participation.

B. Criteria residing in the counselor and the counseling 
situation: 

1. The philosophy, training, and versatility of the coun- 
selor may determine the extent to which client- 
centeredness is used. It is foolish to assume that any 
man can change his habits quickly or that some men 
can ever change their long established procedures 
under even favorable circumstances. Regardless of 
the logic involved, the versatility of the counselor in 
the use of counseling procedures depends upon his 
previous experience, his flexibility of mind, and other 
personal factors. This reality must be recognized. 

2. The time allotment for counseling and the case load 
are factors in the situation which may be impossible 
to change. It is to be assumed that the extreme in 
client-centered counseling, the non-directive, is more 
time-consuming than the completely directive, and 
that the time allotment may determine the number 
of cases with which non-directive approaches can be 
utilized. 

3. The amount of test data and pre-counseling informa- 
tion available should be utilized. Counselors may 
find themselves in a situation where it is expected 
that test information previously secured will be util- 
ized and that information will be shared with the 
student. Bixler (1) has recently indicated proce- 
dures whereby the non-directive counselor can utilize 
test information although this is a move toward the 
directive approach in Rogers’ own terms (5). 

4. The nature of referral of the client to the counselor
is a predetermining factor in a given counseling situation.
It is assumed by Rogers that if the client
does not wish to come and if there is no felt need,
the non-directive approach cannot be utilized. On
the other hand, when a client is referred by a colleague
or administrator the referral is made with the
expectation that counseling will take place. Under
these conditions the counselor must use his best judgment
to secure rapport with the client and to move
constructively toward a solution. Skillful stimulation
of the client by the counselor may be necessary
and may even result in a highly non-directive situation
after a time, but the counselor who does not
take active steps regardless of the nature of the referral
will soon find himself justly accused of ineffectiveness.
The reputation of being a “prima donna”
is hard to live down.

5. The understanding possessed by both the client and
the administrator of the function of the counselor.
If the position and reputation of the counselor in the
situation is such that a decision by him is anticipated,
it will be difficult to use any extreme of non-directiveness.
Administrative decisions and counseling should
not be confused in the same person but they frequently
are. Rather than “throw in the sponge”
regarding effective counseling, as a non-directive
purist might do, such a counselor must meet the
situation and use client-centered approaches to the
degree that he finds possible for each client.

These criteria are suggested in the hope of encouraging the
thoughtful reader to establish his own criteria. Most of us have
counseled for years without any logical basis for determining the
kind of treatment used in a given counseling situation or for
considering the variety of possibilities open to us. Frequently
the right thing is done by what might be called intuition but
the right thing will be done more frequently if thought is given
to an adequate basis for determining the treatment to be
followed. Thoughtful consideration of such criteria as these will
not necessarily result in mechanical processes which are
detrimental to effective client-counselor relationships. The same is
true of a consideration of possible hypotheses regarding basic
problems which underlie a student’s surface indication of need.
The usefulness of these lies in greater clarity of thought during
the interview rather than in a logical or mechanical selection
of hypothesis or procedure during the first contact.

Rogers’ analysis of the extreme client-centered approach has
been helpful to the counseling profession, provided it is seen in
perspective. For a decade or two professional counselors have
made strenuous attempts to discard paternalism and
advice-giving in counseling. We have made great strides toward a
careful intellectual approach to the understanding of the
individual and a diagnosis of both his surface and basic needs. The
clinical use of tests and of other objective information has
advanced counseling far above the level of paternalism. The
fact remains that in this process we may have laid too little
emphasis upon the emotional and intellectual processes at work
in the individual during counseling. We have certainly been
careless in giving sufficient attention to the degree of acceptance
by the client of ideas or solutions proposed by the counselor.
Recent discussion of the non-directive approach has served to
jar counselors into a new awareness of the client’s part in the
process. That this should divide all counseling into two
extreme positions seems both unsound and unrealistic. All
previous work of counselors who are not “non-directive” has
certainly not been “directive” in the extreme sense. Many of the
points emphasized by Rogers have been previously emphasized in
varying degrees by many writers and practitioners although the
fact remains that the impetus given by Rogers to our further
consideration of the total nature of the counseling process has
been a needed and an effective one.

A second objection is registered against the assumption that
the non-directive approach ... for the training of directive
counselors is unnecessary for the non-directive approach. He
has made the non-directive approach ... psychological ...
self-control. This is partly a matter of the personality
integration of the counselor but it is certainly dependent upon
thorough understanding and careful training. Shaffer has
pointed this out in his review of Rogers and Wallen’s book (6).
But if one adopts the concept that the non-directive approach,
in the extreme, is only one of several which an effective
counselor may use, one must also be prepared to use varying
degrees of directiveness in a skillful interpretation of
objective information. Then there must be added to our
present emphasis in professional training a growing insight into
the nature of drives and mechanisms and repressions and
frustrations, in order to effectively run the gamut of procedures in
client-centered counseling. In this we have not subtracted from
the amount of training necessary but have added to it by
including the background necessary for skillful non-directive
counseling where conditions call for this approach.

In summary, the emphasis on the non-directive approach
has been stimulating to the field of counseling but it is not a
new one nor is it simple. We must give more attention to the
client and less to the counselor, but client-centered counseling
is not one part of a dichotomy. It is a continuum. Skillful
counseling consists of knowing when to use the varying
procedures that are available along this continuum. And this
versatility means adding more emphasis to certain areas of a
professional training program, training that will contribute to the
psychological insight and skill needed for the extreme of
client-centered counseling called non-directive.


REFERENCES

1. Bixler, Ray H. and Bixler, Virginia H. “Test Interpretation in
   Vocational Counseling.” EDUCATIONAL AND PSYCHOLOGICAL
   MEASUREMENT, VI (1946), 145-155.
2. Bordin, Edward S. “Diagnosis in Counseling and Psychotherapy.”
   EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT,
   VI (1946), 169-184.
3. Rogers, Carl R. Counseling and Psychotherapy. New York:
   Houghton Mifflin Company, 1942.
4. Rogers, Carl R. “Psychometric Tests and Client-Centered
   Counseling.” EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT,
   VI (1946), 139-144.
5. Rogers, Carl R. and Wallen, John L. Counseling with Returned
   Servicemen. New York: McGraw-Hill Book Company,
   1946.
6. Shaffer, Laurence. Review of Carl R. Rogers and John L. Wallen,
   Counseling with Returned Servicemen. Occupations, XXIV
   (1946), 520-523.




THE EXPERIMENTAL EVALUATION OF A 
SELECTION PROCEDURE 


JOHN C. FLANAGAN 
University of Pittsburgh 


The Planning of the Experiment 


A COMMON problem for research workers concerned with the
development and improvement of procedures for the selection
and training of personnel is the adequate evaluation of
procedures after they have been established. Educational
institutions, business and industrial concerns, and government
organizations, having once accepted certain procedures are generally
unwilling to consider suspending the use of these procedures for
a large enough group to obtain an adequate evaluation of them.
This makes it very difficult to refine and to further improve the
procedures.

Because of the very large numbers of men involved and the
importance of the procedures for the selection of aircrew
in the Army Air Forces, such an evaluation of these procedures
was especially desirable. It was believed that a check on
the value and inter-relation of both the initial screening
procedures and the procedures for qualifying men for pilot training
by the more comprehensive Aircrew Classification Tests should
be made. This could be accomplished by examining a large
enough sample of applicants with these tests and by sending
all of the men tested into training, regardless of the test results.
Accordingly, a memorandum was prepared entitled “Experimental
Study of Eligibility Requirements for Aviation Cadets”
by the present writer in his position as Chief of the
Psychological Branch in May, 1943.

Varied responses were obtained to this proposal from
representatives of other divisions of the Air Staff. Certain of the
Army officers felt that since procedures had been
accepted and appeared to be working well, it was unwise to
conduct a study which might reveal serious defects and weak- 
nesses in them. Others stated that research of this type should 
be carried on in peacetime and should not be allowed to inter- 
fere with established routines for the selection and training of 
men during the war period. One officer suggested that the pro- 
posal to bring in a thousand applicants regardless of test results 
was inconsistent with the proposals by aviation psychologists 
that qualifying standards in terms of pilot stanines be raised. 
Other officers questioned the study on the grounds that the 
value of these new procedures had already been established and 
that further studies were therefore unnecessary. However, the 
argument that the procedures were not perfect and that further 
improvement depended upon such an evaluation won out and
on June 21, 1943, the study was approved and a letter was sent
from the Commanding General, Army Air Forces, to the
Commanding General, Army Service Forces, requesting the
cooperation of the Aviation Cadet Examining Boards in the nine
Service Commands in recruiting this group.

During the preliminary discussions in the Office of the Air
Surgeon it was decided to require full qualification of this group
on the regular physical examination. However, the surgeons
of the Aviation Cadet Examining Boards were told that if the
applicant was otherwise physically qualified he should not be
disqualified by reason of a low Adaptability Rating for Military
Aeronautics. At the classification centers instructions were also
given that no one was to be rejected from the group except for
purely physical reasons.

Approximately forty Boards, representing all of the nine
Service Commands and including all sections of the country,
were authorized to recruit members of the experimental group.
Each Board was given a definite quota. The quotas varied
with the size of the population of the area served by the Board.
The smallest quotas were for twenty aviation cadets and the
largest for seventy-five. In establishing the quotas for the
various Service Commands the numbers recruited from that
Service Command in previous months were also considered.
This was especially important since some of the Service
Commands contained a number of Boards based at Army posts or
stations at which men already in the service could apply. The
quotas for all Service Commands totalled 1450 men. It was
believed that this would allow for a certain number of later
physical disqualifications and other losses and still provide a
group of more than a thousand entering pilot training.


Recruiting the Group¹


To insure that the personnel of the Boards should understand
the general plan and the specific procedures to be followed,
during the month of July an officer from the Psychological
Branch, Research Division, Office of the Air Surgeon, was sent
to each of the Boards which had been given a quota. At the
time these men were being recruited the normal procedure was
for applicants to be sent to basic training centers for six weeks
of basic training, then to college for approximately five months
of pre-aviation cadet college training, and after that to preflight
school for about two months. Following this the individual was
sent to primary flying or to one of the other aircrew specialty
schools.

Since it was desired that the results of this experiment
should be available as quickly as possible, it was decided that
the pre-aviation cadet college course would be omitted for these
men. Accordingly, beginning about August 1, 1943, all applicants
at the authorized Boards were given a statement to sign.
This statement said, “I wish to enter pilot training. If I am
found qualified by the Examining Board I agree (1) to enter
pilot training after a shortened period of basic military training
without first taking the pre-aviation cadet college training
course, and (2) to volunteer for induction within ten (10) days
following the day on which I am found qualified by the
Examining Board.” For enlisted men a similar blank form was
provided, except that it had no reference to basic military training
or volunteering. The examiner also read a statement to the
men, pointing out the advantages to them of becoming aviation
cadets five months earlier, of having the opportunity to earn
... and of becoming officers that much sooner.

¹ Chester W. Harris was responsible for planning the details of the
recruiting procedures and for ... the AAF Examining Boards. In visiting
these Boards and explaining the recruiting procedures to them he shared
this responsibility with William G. Mollenkopf.


All applicants who signed the waiver were given the AAF 
Qualifying Examination and regardless of their score on this 
test were given a physical examination and an interview by the 
Board. If they were found physically qualified and had no 
criminal record they were qualified by the Board for aircrew 
training. Records on these specially recruited men were sent 
directly to the War Department. In Washington special orders 
were written sending a large group of them at one time to a
basic training center with special instructions for their dispo- 
sition. 

From the basic training center they were sent to a classification
center where the Aircrew Classification Tests were given
them. If found physically qualified they were sent into pilot
preflight school regardless of the scores made on the Aircrew
Classification Tests. The orders assigning these men to
classification centers indicated that they were members of the
experimental group. Upon completing their classification processing
they were sent along with other aviation cadets to preflight
schools with no designation as to which ones were members of
the experimental group.

Thus, in preflight schools and in the training schools the
members of the experimental group were not identified by the
orders assigning them and they consequently received no special
treatment. Since the service records of these men did not
contain their stanines for either pilot or other aircrew specialties
the officers in charge of these schools were instructed to wire
the AAF Training Command Headquarters for the disposition
of any men whose stanines did not appear in their service
records.

Orders were issued from Washington on 1311 men recruited
by the various AAF Examining Boards in accordance with the
plan of this study. Of these, 1275 reached the AAF Classification
Centers and were given the Aircrew Classification Tests.
The test results of these men were processed in the usual fashion
and sent to Hq. AAF Training Command after their stanines
had been computed. When the more thorough physical
examination was given at the classification center a number of men
were found disqualified for aviation cadet training.


Of this group 671 men were tested at Psychological Research
Unit No. 1; 365 at Psychological Research Unit No. 2; and the
remaining 239 were scattered among the seven Medical and
Psychological Examining Units. A small number were
disqualified on the Adaptability Rating for Military Aeronautics
during the physical examination in spite of directions to the
contrary. A number of others were eliminated at the
classification centers and no records were sent to Headquarters as to
the reasons for their elimination. The remaining 1143 men
were assigned to pilot preflight schools and this constitutes the
primary sample on which this study is based.
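
The counts reported in this section can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only: the stage labels and the helper name `retention` are assumptions introduced here, while every number is taken from the text.

```python
# Stage counts as reported in the text; the labels are illustrative.
stages = [
    ("quota authorized", 1450),
    ("orders issued", 1311),
    ("tested at classification centers", 1275),
    ("assigned to pilot preflight", 1143),
]

# Men tested at each group of units, as reported.
tested_by_unit = {"PRU No. 1": 671, "PRU No. 2": 365, "seven other units": 239}


def retention(stages):
    """Return (label, count, share of previous stage) for each stage after the first."""
    rows = []
    for (_, prev), (label, count) in zip(stages, stages[1:]):
        rows.append((label, count, count / prev))
    return rows


if __name__ == "__main__":
    # The unit counts should account for every man tested.
    assert sum(tested_by_unit.values()) == 1275
    for label, count, share in retention(stages):
        print(f"{label}: {count} ({share:.1%} of previous stage)")
```

The final stage, 1143 men, is above the figure of a thousand that the quotas were planned to secure.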


Description of the Sample 


It is believed that the sample comprising the basic group
for this experiment was thoroughly typical of applicants for
aviation cadet training. The average age was a little more than
twenty-one years with approximately 30 per cent of the group
eighteen and nineteen years old. By far the largest age group
was nineteen, and 10 per cent were more than twenty-six. From
the standpoint of education 2 per cent were college graduates,
an additional 16 per cent had had some college training, 58 per
cent were high school graduates, and the remaining 25 per cent
had not finished high school, including 1 per cent who had never
attended high school.

Approximately half of them were recruited from the Army
and half from civilian status. With regard to previous flying
experience, nearly 5 per cent had flown solo and an additional
... per cent had had previous instruction. About 58 per cent
had been passengers in a plane but had received no instruction,
and 33 per cent had never been passengers in a plane. In this
group 25 per cent were married, 74 per cent single, and 1 per
cent widowed, divorced, or separated.

The average score on the Army General Classification Test
was 113.0 with a standard deviation for the group of 13.8.
Approximately 10 per cent of the group achieved Army General
Classification Test scores above 130, which placed them in
category I, and approximately 10 per cent obtained scores below 95.
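
As a rough consistency check on these figures, one can ask what a normal distribution with the reported mean and standard deviation would imply for the two tails. The sketch below takes the mean as 113.0 and assumes normality, which the text does not claim; the variable names are illustrative.

```python
from statistics import NormalDist

# Reported mean and standard deviation; normality is an assumption made here.
agct = NormalDist(mu=113.0, sigma=13.8)

share_above_130 = 1.0 - agct.cdf(130)  # upper tail
share_below_95 = agct.cdf(95)          # lower tail

if __name__ == "__main__":
    print(f"expected share above 130: {share_above_130:.3f}")
    print(f"expected share below 95:  {share_below_95:.3f}")
```

Under that assumption both tails come out close to one tenth, in line with the proportions reported for the group.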

In this original group 58 per cent obtained scores which
would have normally passed them on the AAF Qualifying 
Examination and 42 per cent which would have caused their 
rejection. The average score was a few points higher than the 
passing mark and the standard deviation was approximately 
that which had previously been found for unselected applicants. 

It is clear from their educational background, their Army 
General Classification Test scores, and their scores on the AAF 
Qualifying Examination, that this group does not represent a 
random sample of men of Army age. Rather, it represents 
approximately the usual amount of self-selection which can be 
expected in a group of applicants who have chosen to compete 
for a highly desirable job for which the requirements are rela- 
tively high both in terms of the examinations at the time of 
entrance and of the standards for retention in and graduation 
from the training schools. 

To check whether the physical disqualifications and other
losses at the classification centers had any important influence
on the nature of the group, the average test scores of this group
of 1143 were compared with those of the total group tested.
For practically all of these tests the differences between the
means of the two samples were less than one or two hundredths
of a standard deviation and in only one instance did it exceed
five hundredths of a standard deviation. It was therefore
concluded that the losses in the classification centers had not
introduced any significant bias in the samples.
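
The comparison described above expresses each difference between means in units of the standard deviation. A minimal sketch of that calculation follows; the function names, the threshold argument, and the sample scores are hypothetical and are not data from the study.

```python
def standardized_difference(mean_retained, mean_total, sd_total):
    """Difference between two means expressed in standard-deviation units."""
    if sd_total <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean_retained - mean_total) / sd_total


def exceeds(diff, threshold=0.05):
    """True when the absolute standardized difference is larger than threshold."""
    return abs(diff) > threshold


if __name__ == "__main__":
    # Hypothetical test means for the retained group and the total group tested.
    d = standardized_difference(mean_retained=50.3, mean_total=50.1, sd_total=10.0)
    print(f"difference = {d:.3f} SD, exceeds 0.05 SD: {exceeds(d)}")
```

Dividing by the standard deviation of the total group puts tests with different score scales on a common footing, which is what allows a single cutoff such as five hundredths to be applied across the whole battery.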




The Results²


Of the 1143 men who were assigned to pilot preflight schools,
582 were eliminated in primary flying training schools, 83 were
eliminated in basic training schools and 24 eliminated in
advanced flying schools. The remaining 265 graduated from
advanced flying training and were rated as pilots. Of the
men eliminated, 99 were eliminated for academic deficiencies in
preflight school, 591 were eliminated for flying deficiency at one
of the three phases of flying training, and 65 were eliminated
at their own request or because of their fear of flying. The
remaining 122 men were eliminated for administrative reasons,
including physical disqualification. Approximately half of these
were eliminated during preflight school.

² The principal analyses of results were carried on under the immediate
supervision of Robert L. Thorndike and Walter L. Deemer in the
Psychological Sections in Hq. AAF Training Command and Hq. Army Air Forces.

Thus in this group of applicants who were allowed to enter
pilot training without any screening for aptitudes, interests, or
ability, only 23 per cent were successful in completing the course
of pilot training and becoming rated pilots. The question which
the experiment was designed to answer was, “How well did the
initial screening test results, the various classification test
scores, and the pilot stanine predict which one of this group
would succeed?”
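
The 23 per cent figure follows directly from the counts given in this section. The sketch below recomputes it and tallies the stated reasons for elimination; the dictionary keys are illustrative labels, and the numbers are those in the text. The reason counts as they appear here sum to 877, one fewer than 1143 − 265 = 878, so the sketch prints both totals rather than assuming which figure is off.

```python
assigned = 1143
graduated = 265

# Eliminations by stated reason, as given in the text.
eliminated_by_reason = {
    "academic deficiency (preflight)": 99,
    "flying deficiency": 591,
    "own request or fear of flying": 65,
    "administrative or physical": 122,
}

graduation_rate = graduated / assigned
eliminated_total = assigned - graduated
reason_total = sum(eliminated_by_reason.values())

if __name__ == "__main__":
    print(f"graduation rate: {graduation_rate:.1%}")
    print(f"eliminated (assigned - graduated): {eliminated_total}")
    print(f"eliminated (sum of reasons):       {reason_total}")
    for reason, n in eliminated_by_reason.items():
        print(f"  {reason}: {n} ({n / assigned:.1%} of those assigned)")
```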

Figure I shows the success of the pilot stanine in predicting
which of these applicants would be successful. Very few of the
8’s and 9’s were eliminated in the training schools and of those
that were, many were eliminated for physical or administrative
reasons which the tests were not designed to predict. Nearly
half of the 7’s were successful in completing training, but only
a quarter of the 4’s and 5’s and only a very small percentage of
the 2’s and 3’s. None of the 1’s was successful in completing
pilot training.

The chart in the lower half of Figure I presents a more
restricted group. It includes only those cases with no previous
flying experience (no pilot credit) who graduated from preflight
school and entered elementary flying schools, and it excludes
from consideration men who were eliminated for reasons
other than flying deficiency or fear of flying. This chart
also indicates the marked success of the pilot stanine in
predicting which men would graduate from flying training.

In Figure II are presented some charts showing the predictive
value of the printed tests which have substantial weight
in determining the pilot stanine. The two best tests by quite
a margin were found to be the General Information Test
and Instrument Comprehension Test II. These tests
were also found to be superior to any of the apparatus tests
in predictive value. Both tests represent novel ideas
developed within the Aviation Psychology Program.
The Mechanical Principles Test and Spatial Orientation Test II
were also found to have substantial predictive value.


[Figure I. Experimental group. Upper chart: value of augmented pilot stanine for predicting graduation or elimination for all reasons from pilot training, preflight through advanced. Lower chart: value of pilot stanine for predicting graduation or elimination for flying deficiency, fear, or own request from flying training, primary through advanced, excluding cases with credit for previous flying experience (total number = 834). Axes: per cent graduated, per cent eliminated. Legend: eliminated for administrative or physical reasons; eliminated for fear or own request; eliminated for academic or flying deficiency; graduated.]

[Figure II. Predictive value for success in pilot training of printed tests having substantial weights in determining the pilot stanine. Experimental group; eliminations for flying deficiency, fear and own request, preflight through advanced pilot training. Panels include Instrument Comprehension II, Mechanical Principles, Biographical Data Blank (Pilot), Spatial Orientation II, and Spatial Orientation I. Axes: per cent graduated, per cent eliminated. Two correlation coefficients are given per panel: one for the remaining cases when cases with previous flying experience are excluded, and one for all cases, including physical and administrative eliminees.]


The Biographical Data Blank (Pilot), and the Spatial Orienta- 
tion Test I were found to be of more limited value. The find- 
ings regarding the Mechanical Principles Test and the Bio- 
graphical Data Blank (Pilot) are of special interest because 
these tests are quite similar to tests which the U. S. Navy and 
the pilot committee of the National Research Council had 
found to be of value early in the war. These tests constituted 
the principal tests of the U. S. Navy in its pilot selection program
throughout the war.

The Spatial Orientation Tests were developed by the
Psychological Division in the Office of the Air Surgeon very early
in the war and have continued in use ever since with very little
modification. These tests involve the use of aerial photographs
and sectional maps and were developed to measure perceptual
aptitudes in the general area of alertness and observation which
preliminary analysis indicated were important for success in
pilot training. The second part of the test, which involves the
finding of areas shown by aerial photographs on a larger area
portrayed by a sectional map, was found to have more validity
than the similar problem in which areas shown by aerial
photographs were to be located in larger areas also shown as aerial
photographs.

In Figure III are shown the pilot validities of a number of
tests which were developed primarily for the prediction of
navigator and bombardier training success. All of these tests have
been found to have substantial validity for predicting success
in navigator training. The Dial and Table Reading Test gave
a better prediction of success in preflight school than any of the
other tests in the Aircrew Classification Test Battery.

[Figure III. Predictive Value of Printed Tests Developed Primarily
for the Prediction of Success in Navigator and Bombardier Training.
Experimental group. Elimination was for flying deficiency, fear, and
own request, preflight through advanced pilot training. Legible
panels: Reading Comprehension; Dial and Table Reading; Mathematics A;
Instrument Comprehension I; Biographical Data Blank-Navigator. Each
panel plots percent graduated and percent eliminated. Notes: one
coefficient is for all cases, including physical and administrative
eliminees; the other is for the remaining cases when cases with
previous flying experience are excluded.]

Instrument Comprehension Test I, which is similar in some
ways to the Dial and Table Reading Test, was found on the
basis of approximately 1500 cases to have a substantially lower
predictive value for success in primary flying training
than Instrument Comprehension Test II. Because of its high
correlation with this latter test, statistical analyses indicated
that it could be profitably used to suppress certain extraneous
factors present in Instrument Comprehension Test II and thus
improve the predictive value of the pilot stanine. Unfortunately,
in subsequent samples the correlation between the two
tests was found to be smaller than had been originally obtained.
Also, the validity of Test I was found to be somewhat larger
for primary training than previously found, and even close to
the validity of Test II for basic and advanced training. For
preflight training Test I was superior to Test II in its predictive
value. Thus its early promise as a suppression test was
not fulfilled and it was later dropped from the battery, since
other tests, primarily Instrument Comprehension Test II and
the Dial and Table Reading Test, appeared to provide adequate
coverage of the functions measured by this test.
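
The suppression argument above can be illustrated with the standard
two-predictor formulas for the multiple correlation and the
standardized weights. The sketch below is only an illustration: the
three coefficients are hypothetical values chosen for the example, not
figures from this study, and the function names are invented here.

```python
import math

def multiple_r(r1y, r2y, r12):
    """Multiple correlation of a criterion with two optimally weighted tests."""
    r_squared = (r1y**2 + r2y**2 - 2 * r1y * r2y * r12) / (1 - r12**2)
    return math.sqrt(r_squared)

def beta_weights(r1y, r2y, r12):
    """Standardized regression weights for test 1 and test 2."""
    denom = 1 - r12**2
    return (r1y - r2y * r12) / denom, (r2y - r1y * r12) / denom

# Hypothetical case: test II has validity .40, test I only .10.
VALID_II, VALID_I = 0.40, 0.10

# High intercorrelation (.70): test I takes a negative weight and the
# composite predicts better than test II alone.
r_high = multiple_r(VALID_II, VALID_I, 0.70)
w_ii_high, w_i_high = beta_weights(VALID_II, VALID_I, 0.70)

# Lower intercorrelation (.30): the gain from suppression nearly vanishes.
r_low = multiple_r(VALID_II, VALID_I, 0.30)

print(round(r_high, 3), round(w_i_high, 3), round(r_low, 3))
```

With these assumed values the gain over test II alone depends almost
entirely on the size of the intercorrelation, which is the reason a
smaller intercorrelation in later samples removes the advantage.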

The Mathematics Tests, the Test of Reading Comprehension,
and the Navigator Key for the Biographical Data Blank
were especially useful as classification tests because of their
only moderate validity for predicting success in pilot training
and their substantial predictive value for navigation training
success.

The predictive value of the apparatus tests used in the Aircrew
Classification Test Battery at the time the experimental
group was tested is shown in Figure IV. It is seen that the
Discrimination Reaction Time Test, the Rudder Control Test,
and the Complex Coordination Test all have substantial predictive
value for pilot training. The Two-Hand Coordination
Test had somewhat less predictive value and the Rotary Pursuit
Test was of limited value for this sample. The Finger
Dexterity Test was of course not weighted for predicting success
in pilot training.

The Rudder Control Test had the greatest predictive value
for success in primary training schools and for predicting flying
elimination when cases with previous flying experience were
included. The Discrimination Reaction Time Test and the
Complex Coordination Test were superior in predictive value
to the Rudder Control Test for predicting success in basic training
and preflight training. The Discrimination Reaction Time Test was
especially good for predicting success in preflight school.

[Figure IV. Predictive Value of Apparatus Tests for Success in Pilot
Training. Experimental group. Elimination was for flying deficiency,
fear, and own request, preflight through advanced pilot training.
Legible panels: Discrimination Reaction Time; Rudder Control;
Two-Hand Coordination; Rotary Pursuit. Each panel plots percent
graduated and percent eliminated. Notes: one coefficient is for all
cases, including physical and administrative eliminees; the other is
for the remaining cases when cases with previous flying experience
are excluded.]

For comparison the predictive value of certain other variables
is shown in Figure V. It is seen that there is a very marked
relationship between previous flying experience and success in
pilot training. Education shows a very much smaller relationship.
The General Classification Test has some predictive
value in this unselected group but would not add to the over-all
accuracy of predictions of the Aircrew Classification Tests. In
this sample, age and marital status have practically no relationship
to success in flying training.

The Adaptability Rating for Military Aeronautics appears
to have some predictive value for pilot training. An intensive
analysis of the interview sheets used by ten examiners at the
San Antonio Aviation Cadet Center suggested that the principal
contributors were education, vocational achievement, interest
in flying, national origin, and family income. There was a
slight indication that the men who were rated as relaxed and
listless during the interview were more successful in flying training
than those who were rated as eager or tense. Neither the
extent of hand tremor nor flushing was found to have predictive
value for success in pilot training.

A number of statistical studies were carried out to evaluate
the effectiveness of the Aircrew Classification Test Battery and
the pilot stanine in predicting success in pilot training. A table
containing the product moment intercorrelations of all of the
variables in the Aircrew Classification Test Battery was prepared
in order that certain analytical studies of combinations
of tests and weights for specific tests could be made in a precise
fashion. This table of intercorrelations is reproduced in
Table 1. A number of analyses were made using the intercorrelations
in this table and the biserial correlation coefficients
obtained between the test scores and the stanine and success in
pilot training.

In calculating these coefficients, men eliminated for physical
and administrative reasons were excluded from consideration.
The two categories consisted of 262 men who graduated from
advanced training and 755 who were eliminated in preflight,
primary, basic, or advanced schools because of academic or
flying deficiency, or fear of flying. The results of these analyses
are reproduced in Table 2 below.
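
A biserial coefficient of the kind described above, relating a
continuous test score to the graduate-eliminee dichotomy, can be
sketched with the usual textbook formula. This is a minimal sketch
under that assumption; the function name and the ten scores are
invented for illustration and are not data from the study.

```python
from statistics import NormalDist, mean, pstdev

def biserial_r(scores, graduated):
    """Biserial correlation between a score and a pass/fail split.

    scores:    continuous test scores
    graduated: booleans, True for graduates, False for eliminees
    """
    grads = [s for s, g in zip(scores, graduated) if g]
    elims = [s for s, g in zip(scores, graduated) if not g]
    p = len(grads) / len(scores)      # proportion graduating
    q = 1 - p                         # proportion eliminated
    # Ordinate of the unit normal curve at the point cutting off p and q.
    y = NormalDist().pdf(NormalDist().inv_cdf(q))
    return (mean(grads) - mean(elims)) / pstdev(scores) * (p * q / y)

# Hypothetical scores for five eliminees followed by five graduates.
example_scores = [2, 3, 4, 5, 6, 4, 5, 6, 7, 8]
example_flags = [False] * 5 + [True] * 5
example_r = biserial_r(example_scores, example_flags)
```

The formula assumes that a normally distributed variable underlies the
dichotomy, so with markedly non-normal data the coefficient can exceed
the value a product moment correlation would give.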

[Figure V. Predictive Value of Six Additional Variables. Experimental
group. Elimination was for flying deficiency, fear, and own request,
preflight through advanced pilot training. Legible panels:
Educational Status; General Classification Test; Marital Status;
Adaptability Rating for Military Aeronautics. Each panel plots percent
eliminated.]

[Table 1. Intercorrelations of Classification Tests, Stanines and
Other Measures of Experimental Group. The table and its continuation
were printed sideways; the entries are not recoverable from this
copy.]

TABLE 2

The Predictive Value of Various Combinations of Tests for Success in
Pilot Training as Determined from an Experimental Group of 1017 Men

                                            Correlation coefficient
                                            with pilot training
                                            graduates-eliminees
                                            (academic and flying
                                            deficiency and fear
Combination of predictions used             of flying)

Pilot stanine ...........................          .660
Best-weighted combination of aircrew
  classification tests for this sample ..          .690
Best-weighted combination of printed
  tests in aircrew classification
  battery for this sample ...............          .641
Best-weighted combination of apparatus
  tests in aircrew classification
  battery for this sample ...............          .578

Using this set of validity coefficients and intercorrelations,
the “best-weights” give a prediction of success in pilot training
which is characterized by a correlation coefficient .03 more
than that obtained from the particular set of weights used in
computing the pilot stanine at the time these men entered
training. It is known that correlation coefficients obtained in
this way tend to show some shrinkage in a new sample, even
though the sample is relatively large, as in this case. We can
conclude that the weights in use at that time were fairly close
to the optimal ones.

As is indicated in the table, it is also possible to predict
success in pilot training with printed tests alone with accuracy
only moderately diminished, a correlation coefficient about .05
smaller than with the complete battery. Using the apparatus
tests alone the corresponding reduction in the coefficient is
about .11.

A type of problem frequently encountered in selection research
is the question of the effect of selection on the basis of
one variable on the predictive value found for a second set of
scores. To make an empirical check on this, biserial correlation
coefficients were computed excluding all of those individuals
who would have normally been rejected on the basis of their AAF
Qualifying Examination score. The correlation coefficients obtained
for this group of 540 men were compared with those
obtained for the uncurtailed group of 1036 in predicting success
in preflight and primary training schools. It was found that the
average coefficient was approximately .05 lower in the restricted
group. The validity of the pilot stanine was also .0[?] lower in
this curtailed group.
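
The effect described above, in which selection on one variable lowers
the validity found for a second variable, can be reproduced in a small
simulation. This is a sketch only: the population correlations (.6
between the two tests, .5 between the second test and the criterion),
the cut at the median, and the sample size are assumptions made for the
example, not values from the study.

```python
import math
import random

def pearson_r(xs, ys):
    """Product moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate(n=50000, seed=0):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)                      # screening examination
        x = 0.6 * z + 0.8 * rng.gauss(0, 1)      # second test, r = .6 with z
        y = 0.5 * x + math.sqrt(0.75) * rng.gauss(0, 1)  # criterion, r = .5 with x
        rows.append((z, x, y))
    full = pearson_r([r[1] for r in rows], [r[2] for r in rows])
    kept = [r for r in rows if r[0] > 0]         # reject the lower half on z
    restricted = pearson_r([r[1] for r in kept], [r[2] for r in kept])
    return full, restricted

full_r, restricted_r = simulate()
print(f"unrestricted: {full_r:.3f}   restricted: {restricted_r:.3f}")
```

Under these assumed parameters theory gives a restricted coefficient of
about .45 against .50 in the full group, a drop of the same order as
the one reported for the curtailed group.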


A special study was made of the aircraft accident records
of this group. Of the total group of about a thousand men,
twenty had aircraft accidents in training planes in the AAF
Training Command. There were five accidents that involved
pilots with pilot stanines of 7, 8, or 9. These higher stanine
groups produced approximately a hundred of the graduates
from pilot training. The lower stanine groups, which produced
one hundred and fifty graduates, had a total of fifteen accidents.

Four of the accidents were fatal and these all involved individuals
in the lower stanine groups. For the four men involved
in fatal accidents, the stanines for bombardier, navigator, and
pilot training were, in that order, 324, 636, 445, and 996. The
first three were all violating flying regulations at the time of the
accidents. The fourth individual overshot his turn from base leg
to final approach in lining up with the runway. In trying
to get back, he stalled out and went into a half-snap.
[...] took over but the plane hit on the left wing [...]

Detailed Individual Follow-Ups

In a selection and classification program involving the testing
and follow-up of hundreds of thousands of men, it is easy to
lose sight of the individual man. Because of the special nature
of the experimental group and the extensive amount of individual
data already collected concerning these men, it was
believed worthwhile to make an intensive study of certain individuals.
It was believed that most could be learned by
studying the cases for which the predictions were not fulfilled.
Accordingly a group of thirty-one men, including fifteen men
with pilot stanines of 8 or 9 who were eliminated from training
and sixteen men with pilot stanines of 2 or 3 who graduated
from training, were made the subjects of a special individual
follow-up conducted by an aviation psychologist from the AAF
Training Command. Complete case studies were prepared for
each of the thirty-one individuals. The sources of information
included (a) psychological records of test scores, interests, and [...]

The detailed individual follow-up study reported in this section was
conducted [...]


Implications

This study of 1,000 applicants and their success in pilot
training in relation to their scores on the selection and classification
tests has clearly demonstrated the effectiveness of these
procedures when applied to groups of men from
civilian life or from the Army. Of 405 men who failed the
AAF Qualifying Examination and were subsequently sent to
pilot training, only twelve achieved pilot stanines of 7, 8, or 9,
and only four of these and forty-one others of the
500 men who failed the Qualifying Examination were graduated
from pilot training.

The value of the second screening by the Aircrew Classification
Tests was dramatically demonstrated by the graduation
of only sixteen men out of 442 with pilot stanines of 1, 2, or 3
sent into preflight training. At the same time 113 men graduated
of the 199 with pilot stanines of 7, 8, and 9 sent into preflight
training.

The biserial coefficient of .66 obtained between pilot
stanine and success in pilot training compares favorably with
the best predictions which have been obtained in educational
and industrial work. It now appears that further development
of instructional techniques and procedures for passing and failing
students needs to be made before a substantial amount of
further refinement in the selection and classification procedures
can be expected.
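
The contrast between the low and high stanine groups reported above
can be put as graduation rates. The counts are those given in the text;
the variable names are chosen here for the illustration.

```python
def graduation_rate(graduated, sent):
    """Proportion of men sent into preflight training who graduated."""
    return graduated / sent

low_rate = graduation_rate(16, 442)    # pilot stanines 1, 2, and 3
high_rate = graduation_rate(113, 199)  # pilot stanines 7, 8, and 9

print(f"stanines 1-3: {low_rate:.1%}")
print(f"stanines 7-9: {high_rate:.1%}")
print(f"ratio of rates: {high_rate / low_rate:.1f}")
```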




MEASUREMENT OF ATTITUDES TOWARD
COUNSELING¹


WILTON P. CHASE 


Veterans Administration 


Introduction 


AN ANALYSIS of characteristics which a good counselor
should possess would include a better-than-average degree of
intelligence, an education adapted to his needs for developing
proper knowledge and skills as a counselor, experience in guidance,
a mature, well-adjusted personality, an interest in understanding
and helping others with their problems of educational,
vocational, and personal adjustment, and attitudes which would
insure the proper application of counseling procedures. This
paper is concerned with the feasibility of measuring the last of
these characteristics in an objective manner.


Method

One hundred and one statements of opinion toward counseling
procedures in relation to the Army's Separation Classification
and Counseling Program were written and submitted in
the form of a preliminary scale to 34 judges, selected because
of their understanding of and ability in counseling. The
ratings were made on a five-point scale according to their
opinion as to whether the practice described in each item was:

1. Decidedly harmful
2. Probably harmful
3. Of doubtful value
4. Probably good
5. Decidedly good

1 The author gratefully acknowledges the assistance of Dr. Britten L. Riker in
trying out the procedures employed in constructing the attitude scale toward counseling.
The opinions expressed in this paper are those of the author and do not necessarily
represent the views of the War Department or of the Veterans Administration.



The ratings of the judges were tabulated for each item. 
Where there was no majority agreement upon how a statement 
should be rated, it was eliminated. The statement which fol- 
lows, with the distribution of the judges’ rating, is an example: 


Defining the Interview Situation in Terms of
Diagnostic Procedures

Scale Value Number Rating % Rating
1. Decidedly harmful 7 20
2. Probably harmful . 7 26
3. Of doubtful value . 2 26
4. Probably good ... 9 7
5. Decidedly good ... 2


Where there was a clear majority for a particular rating of
a statement, it was retained and was scored in the final scale
on the basis of crediting one point if that particular rating was
checked. An example follows:


Indicating the Topic but Leaving the Development of
the Story to the Counselee

Scale Value                Number Rating   % Rating
1. Decidedly harmful ....         0            0
2. Probably harmful .....         0            0
3. Of doubtful value ....         4           12
4. Probably good ........        19           56
5. Decidedly good .......        11           32

Where there was no clear majority for a particular evaluation,
but where the majority of the opinions were about equally
divided between two adjacent ratings, the statement was retained
and either rating was credited one point in the scoring
of the item in the final scale. An example follows:

Expressing Disapproval of the Remarks of the Counselee

Scale Value                Number Rating   % Rating
1. Decidedly harmful ....        13           38
2. Probably harmful .....        13           38
3. Of doubtful value ....         6           18
4. Probably good ........         2            7
5. Decidedly good .......         0            0

Seventy-four statements remained in the final scale.² The
scale was then given to 180 students of Class No. [?] who had
completed the course at the Separation Classification School,
Fort Dix, New Jersey. The course consisted of four weeks'
intensive instruction of six to eight hours a day, six days a week,
in principles and techniques of interviewing and counseling,
individual differences, testing, educational and occupational
information, counseling the physically handicapped, use of the
Dictionary of Occupational Titles, civilian referral agencies,
Army classification procedures, preparation of the Separation
Qualification Record (WD AGO Form 100), and other information
considered essential to train military counselors for
duty in Separation Centers and hospitals.

2 It was necessary as a practical expedient to adopt the method of determining
scale values which is described because time did not permit of a more refined statistical
procedure, such as determining the mean and standard deviation of the judges'
ratings for each statement as its scale value.
Inspection of the distributions of the judges' ratings for statements
retained in the scale indicated that the variability was small.

Students to attend the class had been selected after meeting
certain minimum qualifications, namely, completion of two
years of college work, a minimum of two years' experience in
some phase or field of personnel or closely allied work in civilian
life (three years' additional experience could be substituted
for lack of educational qualification if necessary), a
minimum age of 25 years, and a standard score of 110 or better on
the Army General Classification Test. Due to the critical
shortage of personnel meeting these requirements, it was necessary
in individual cases to relax these standards in one or more
categories in order to fill the established quotas.
In general, students meeting these requirements were
considered to be potentially qualified as military counselors
upon successful completion of the program of instruction at
the school.


For the purpose of this study, in addition to the scores
obtained on the scale of attitudes toward counseling,
there were available the final class averages based
upon four hours of objective testing covering all phases of the
class instruction offered at the school and the practical
work which accompanied it. These are conversions of raw
scores on the examinations to a numerical rating on the basis of
100 constituting the highest grade which could be earned.

Also available, and employed in this study, were the stand- 
ard scores on the revised form of the Army General Classifica- 
tion Test which was given to the students as a group (171 of the 
180 students were present for this test). 
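
The keying and scoring rules described in this section can be sketched
as follows. The text does not state numerical thresholds, so two
assumptions are made and should be read as such: a “clear majority” is
taken to mean more than half of the judges, and two adjacent ratings
are treated as “about equally divided” when their counts differ by no
more than a hypothetical tolerance of two. The function names are
likewise invented for the illustration.

```python
def key_item(judge_counts, tolerance=2):
    """Return the keyed ratings for one statement, or [] if it is dropped.

    judge_counts: number of judges choosing ratings 1..5, in that order.
    tolerance:    assumed limit on the difference between two adjacent
                  counts for them to count as "about equally divided".
    """
    total = sum(judge_counts)
    # Rule 1: a single rating chosen by a majority of the judges.
    for rating, count in enumerate(judge_counts, start=1):
        if count > total / 2:
            return [rating]
    # Rule 2: two adjacent ratings that together hold a majority and
    # are about equally divided.
    for i in range(len(judge_counts) - 1):
        a, b = judge_counts[i], judge_counts[i + 1]
        if a + b > total / 2 and abs(a - b) <= tolerance:
            return [i + 1, i + 2]
    # Otherwise there is no agreement and the statement is eliminated.
    return []

def score_scale(responses, keys):
    """Credit one point for each response that matches a keyed rating."""
    return sum(1 for r, k in zip(responses, keys) if r in k)

keys = [
    key_item([0, 0, 4, 19, 11]),   # clear majority for rating 4
    key_item([13, 13, 6, 2, 0]),   # divided between ratings 1 and 2
]
print(keys, score_scale([4, 2], keys))
```

The two judge distributions used here are those of the second and
third examples in the text, and under the stated assumptions they are
keyed 4 and 1 or 2 respectively.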


Results 


On the basis of the ratings of the 34 judges there was agreement
upon the value of certain counseling practices, as follows:

Practices Judged Probably Good or Decidedly Good

Scale
Values
[?]   Warning the counselee of the dangers of failure in a new endeavor
[?]   Defining the problem in terms of the counselee's responsibilities in
      reaching decisions
[?]   Recognizing the counselee's expression of feelings
[?]   Summing the problem for the purpose of giving remedial techniques
[?]   Leaving the development of the story to the counselee
[?]   Responding so as to indicate familiarity with the counselee's problem
[?]   Indicating that the decision is up to the counselee
[?]   Signifying the acceptance of a counselee's decision when in agreement
[?]   Signifying the rejection of a counselee's decision when it is factually
      wrong
[?]   Identifying a problem through evaluative remarks resulting from test
      interpretation
[?]   Explaining the source of difficulty by evaluative statements
[?]   Proposing an activity that the counselee should engage in to reach an
      adjustment
4,5   Pointing out a problem or condition needing correction
[?]   Recognizing the feeling or attitude which the counselee has expressed
4,5   Interpreting feelings expressed in general demeanor, specific behavior
      or earlier statements to further rapport and solution of a problem
[?]   Discussing information related to the problem
4,5   Defining the interview situation in terms of the counselee's
      responsibility for using it
[?]   Listening to the counselee in a patient and friendly, but intelligently
      critical manner
4,5   Helping or aiding the counselee to verbalize his thought
[?]   Probing in unexplored areas to encourage verbal responses
[?]   Relieving the counselee of fears and anxieties which may affect his
      relation to the counselor
[?]   Veering the discussion to some topics which have been omitted
[?]   Accepting the counselee as he is
[?]   Creating a friendly relationship with the counselee
[?]   Permitting the counselee to express himself freely
[?]   Assisting the counselee to analyze himself
[?]   Objectifying the problem for the counselee in general terms
[?]   Showing warmth of feeling
[?]   Displaying responsiveness to the counselee's attitude
[?]   Indicating that purpose of the interview is to help the counselee
[?]   Clarifying the area where decision is needed
[?]   Fostering emotional maturity toward the problem
5     Referring the counselee to specialists in various fields
4,5   Simplifying a problem
4     Acquainting the dischargee with the potential pitfalls in civilian life
      for the veteran
5     Adapting the level of conversation to meet the counselee's level
5     Orienting the dischargee to the purpose of the Form 100
4     Pointing out to the dischargee that as a civilian he will have to
      accept more responsibility
4     Showing the counselee where he erred in his plan
4,5   Acquainting the dischargee with the provisions of the G.I. Bill
5     Giving notes to the counselee which [...]
5     Advising the dischargee to take his Form 100 [...] job seeking
[?]   Listening without comment to gripes
4     Preparing the dischargee for the indifference he will meet in civilian
      life
5     Using techniques which will elicit responses even from reticent, shy,
      or sullen counselees

Practices Judged Doubtful

Scale
Values
3     Calling the counselee by his first name
3,4   Expressing approval to remarks of the counselee
[?]   Interjecting general thoughts which [...] the counselee's position
[?]   Discussing assumptions upon which the counseling session is based
[?]   Sympathizing with the counselee
3     Suggesting to the counselee that as a [...] veterans' organizations
3     Discussing general economic conditions and problems
3     Discussing general political and racial problems
[?]   Advising the counselee to stay on the safe side and not to take chances

Practices Judged Probably Harmful or Decidedly Harmful

Scale
Values
[?]   Becoming emotionally involved in the counselee's problems
[?]   Shaming the counselee who complains of bad breaks
[?]   Side-stepping the counselee's expressed attitude, or feeling
[?]   Expressing disapproval to remarks of the counselee
[?]   Rendering moral admonition to curb anti-social feeling
[?]   Arguing points of disagreements in order to clear [...] progress
[?]   Putting the counselee on the defensive
[?]   [...] worries [...]
[?]   Encouraging all dischargees to take advantage of educational benefits
      for veterans
1     Reprimanding the counselee for developing [...]
2     Suggesting that every dischargee has difficulty in adjusting to
      civilian life
[?]   Leaving the dischargee with the idea that he should immediately go back
      to work
[?]   Refusing to discuss a problem because no clear course is indicated
[?]   [...]rating the value of the counselor's own task
[?]   Advising the dischargee to wait before talking over his problems until
      he is a civilian
[?]   Assuming a superior attitude
[?]   Side-stepping important problems for lack of time
[?]   Avoiding doing more than scratching the surface of a problem
[?]   Prescribing a course of action as a doctor prescribes medicine



The students' scores on the counseling attitude scale, their
grades and their standard scores on the Army General Classification
Test (Revised) approximate normal distributions. The
lowest and highest score, the average score and the standard
deviation for each of the measures are as follows:

                              Lowest  Highest  Average     σ
Counseling Attitude Scale ...    6       59      48.1     6.75
AGCT ........................   [?]     156     125.7    14.23
Grades ......................    43      94      79.95    7.10

The range of scores on the attitude toward the counseling
scale is from 6 through 59 with a maximum possible score of 74,
which indicates that no student in the group approached the
test ceiling. The distribution of scores is foreshortened at the
upper end. Even at the conclusion of an intensive course in
counseling, the group as a whole had a long way to go in developing
attitudes toward counseling when compared with
expert opinion as it is represented by the items included in the
scale.


The coefficient of reliability for the counseling attitude scale
obtained by correlating scores on odd versus even items is .[?],
which is raised to .77 by applying the Spearman-Brown formula.
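
The Spearman-Brown step mentioned above can be written out directly.
The sketch gives the formula for a test lengthened by a chosen factor
and its inverse for the doubled test; the function names are chosen
here, and the .63 used in the example is an illustrative half-test
value, not the figure printed in the original.

```python
def spearman_brown(r_half, factor=2):
    """Reliability of a test lengthened `factor` times, from the
    correlation r_half between comparable parts."""
    return factor * r_half / (1 + (factor - 1) * r_half)

def half_test_r(r_full):
    """Half-test correlation implied by a stepped-up reliability
    (inverse of the formula for factor = 2)."""
    return r_full / (2 - r_full)

print(round(spearman_brown(0.63), 3))   # an odd-even r of .63 steps up to about .77
print(round(half_test_r(0.77), 3))      # a reliability of .77 implies about .63
```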
Correlations among the various measures employed in this
study are as follows:

                                 Grades    AGCT
Counseling attitude scale ....     .24     .[?]9
Grades .......................             [?]

The partial coefficient of correlation for scores on the counseling
attitude scale and grades with their relationship to the
standard scores on the AGCT held constant is .15.
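
A partial coefficient of this kind is obtained from the three
zero-order correlations by the first-order partial correlation formula.
The sketch below shows the computation; the coefficients in the example
are hypothetical and are not the values of this study, two of which are
illegible in the table above.

```python
import math

def partial_r(r12, r13, r23):
    """Correlation of variables 1 and 2 with variable 3 held constant."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13**2) * (1 - r23**2))

# Hypothetical zero-order coefficients for illustration only.
example_partial = partial_r(0.50, 0.30, 0.40)
print(round(example_partial, 3))
```

When the third variable is uncorrelated with the other two, the partial
coefficient equals the zero-order coefficient.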


Summary

The results presented in this study indicate the following:

1. It is possible to construct an attitude scale toward counseling
practices based upon agreement of qualified judges as to
the value of the practices described in the statements employed
in the scale.

2. The results obtained from employing the scale to measure
the attitudes of a group of highly selected adult students completing
an intensive course in counseling show little correlation
with their academic standing or with their scores on the Army
General Classification Test (Revised).

3. The acquisition of effective attitudes toward accepted
counseling practices is not related to the scholastic ability of
students to the same degree as is their achievement in learning
counseling information and techniques.

4. The results of this study tend to bear out the hypothesis
that for beginning counselors some time is needed for them to
learn fully to appreciate the significance of effective attitudes
toward counseling, which probably can be acquired only
through actual experience in the counseling situation rather
than through a study of counseling techniques in a formal
course of instruction.


RESPONSE SETS AND TEST VALIDITY 


LEE J. CRONBACH 


University of Chicago 


A PSYCHOLOGICAL or educational test is constructed by
choosing items of the desired content, and refining them by
empirical techniques. The assumption is generally made, and
validated as well as possible, that what the test measures is
determined by the content of the items. Yet the final score
of the person on any test is a composite of effects resulting
from the content of the item and effects resulting from the form
of the item used. A test supposedly measuring one variable may
also be measuring another trait which would not influence the
score if another type of item were used. This paper attempts
to demonstrate these influences in a variety of tests, to examine
the effect of these extraneous factors, and to suggest means of
controlling them.

Numerous studies show that scores may be influenced by
variables other than the one supposedly tested. In the Minnesota
Multiphasic Personality Inventory, for example, it is
explicitly recognized that a subject may evade questions by
the excessive use of the response “Cannot Say,” even though
his actual behavior might be properly described by the response
“True” or “False.” This tendency invalidates the test profile
for persons who show an extreme number of “Cannot Say”
responses. Another example is the influence of acquiescence on
true-false test performance (2, 4). Under many conditions,
when separate scores are obtained on the true items and on the
false items of the typical achievement test, the correlation of
the two scores is near zero even when the test as a whole is reliable.
In other words, items in these two forms, the true and
the false, do not measure the same trait. This is attributed to
the tendency of some students to respond “True” when in [...]



unless they were very sure of the direction of the deviation. 
It had at first been thought that men who failed had poor 
discrimination ability, but it became apparent that the test 
measured both this ability, reflecting the content of the item, 
and set to ignore small differences, introduced by the form of 
the judgment. In fact, the training problem was to teach the 
response set, to produce uniform reports, as much as it was to
train judgment.

2. Definition of Judgment Categories. Different persons
assign different meanings to the terms used in responding:
“Yes,” “Strongly agree,” etc. This problem overlaps that of
caution, just described. In P.E.A. Test 8.2, “Interest Index”
(24, p. 338), which calls for an “L-I-D” response, two students
enjoying a particular area equally may have different “Per cent
Like” scores, because one defines “Like” as any degree of ac-
ceptance rather than rejection while the second reserves “Like”
for those things he has a real desire to do. Simpson (23) and
Mosier (17) have shown individual differences in using words
such as “frequently,” “indifferent,” and “desirable.” Mosier
found these differences to be reliable. Osgood reports that on a
seven-point scale some persons predominantly use positions 1
and 7, some use 1, 4 and 7, while some use the whole scale (18).
The use of intermediate rather than more extreme scale positions
is treated as a reliable index of behavior (“caution in drawing
conclusions”) in P.E.A. Test 2.51, “Interpretation of Data”
(24, pp. 62-65). The various writers have attributed these
differences, tentatively, to true personality differences in
caution or conservatism, to intellectual differences such as
critical thinking, or to differences in word meaning.
3. Inclusiveness. In some tests, the student is permitted
to give as many answers as he desires, as in such essay questions
as “Point out differences between” or “List the causes of.”
An open question of this type may elicit an extensive list of
points from one student, and a short selected list from another.
Which receives the better score depends on the scoring method,
but the score may reflect technique or set in answering as well
as ability. The same possibility occurs in some objective ex-
aminations. In P.E.A. Test 1.41, “Social Problems,” the stu-
dent is given a problem, asked to select his choice of courses of
action and to check reasons to support his conclusion (24, pp.
180-190). Some students check many reasons, some few. But
one who checks many reasons is likely to receive a higher score
in Irrelevancy or Inconsistency than one who lists only the
reason he considers truly basic to his opinion. This may be a
basic trait in the student's reasoning, but it may as easily be a
reflection of the way he interpreted the directions and the in-
tent of the test. While inclusiveness need not confuse interpre-
tation when the pattern of scores is studied as a whole, it does
prevent meaningful treatment of a single score, such as Irrele-
vant Reasons. The P.E.A. Tests of Application of Principles
in Science (24, pp. 80-111) also permit inclusiveness to affect
scores, since the student is permitted to check as many reasons
as he wishes to support his conclusions. (But cf. 24, p. 117,
where a test was redesigned because inclusiveness interfered
with validity.)


The Thurstone-type attitude scale permits inclusiveness to
affect score. The subject is directed to check those statements
with which he agrees. Some persons check only two or three
statements, while others check several. The score is the median
scale value of the statements checked. There is a tacit assump-
tion that if a person checks additional statements beyond the
most appropriate ones, they are balanced around the median.
But there is no evidence that these additional statements do
not bring the person's score nearer to the group mean or in
some other way modify it. The same tendency can be found in
checklists of any sort.

4. Bias; acquiescence. When students are offered two al-
ternatives, as “True” versus “False,” “Like” versus “Dislike,”
“Agree” versus “Disagree,” etc., some use one more than the
other. This effect has been demonstrated in the true-false test
and in personality and attitude tests (2, 4, 15, 19). Individual
differences in acquiescence (tendency to use “Yes” or “True”)
are reliable by split-half and parallel-test-with-elapsed-time
methods. The majority of students have an excess number of
“Yes” responses on true-false tests. Since response tendencies
affect an answer only when the student is to some degree un-
certain about the content of the item, acquiescence tends to
make false items more valid, and true items less valid (2, 4).
The poor student, guessing, tends to be right on the true item
because of acquiescence, but tends to be wrong on the false item.
False items alone are often as reliable and valid as the entire
test of double the length. A large amount of acquiescence tends
to reduce the deviation of the student's score from the mean.
Rundquist found personality test items where “Yes” repre-
sented a negativistic answer more valid than those where “Yes”
was favorable (19).
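
The argument that acquiescence makes false items more discriminating than true items can be put into simple arithmetic. The sketch below is illustrative only: the function name, the knowledge levels (0.2 and 0.8), and the probability of marking “True” when unsure (0.7) are assumptions introduced here, not figures from the studies cited.

```python
def expected_proportion_right(knowledge, p_true_when_unsure):
    """Expected proportion right on true items and on false items.

    knowledge: proportion of items the student actually knows (assumed).
    p_true_when_unsure: chance he marks "True" when he does not know (assumed).
    """
    unsure = 1.0 - knowledge
    on_true_items = knowledge + unsure * p_true_when_unsure
    on_false_items = knowledge + unsure * (1.0 - p_true_when_unsure)
    return on_true_items, on_false_items


# A poor and a good student with the same acquiescent tendency.
poor_true, poor_false = expected_proportion_right(0.2, 0.7)
good_true, good_false = expected_proportion_right(0.8, 0.7)

# Separation between the two students on each item type.
gap_on_true_items = good_true - poor_true
gap_on_false_items = good_false - poor_false
```

Under these assumed values the two students are separated much more widely on the false items than on the true items, which is the sense in which false items are “more valid” for an acquiescent group.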

Another instance of individual differences in bias appears
in a pitch-discrimination test. A recorded experimental test,
roughly similar to the Seashore Pitch test, was given to three
psychology classes. Students were directed to mark each item
H or L according as the second of a pair of tones was higher or
lower than the first. The test was very easy, the median num-
ber right being 94. Individual students, particularly those with
poor scores, showed marked tendencies to overuse one of the
two alternatives. One student had 18 errors of the HmL
(“higher marked lower”) type but only three LmH; another had
16 LmH to 5 HmL. More definite evidence was found on the
last twenty items, which were near the pitch-difference thresh-
old for most students. In general, response sets influence per-
formance in unstructured situations. The twenty items were
divided into four sets, two containing five L items, and two
containing five H items. The two sets of L items had equal
difficulty, as did the two sets of H items.
Scores were obtained for 133 students. If there is bias in re-
sponding, the number of correct H answers exceeds the number
of correct L answers for the student, or vice versa. This bias
is represented by the formula H - L (where H is number right
on H items). Out of a possible range, for twenty items, from
10 to -10, actual bias scores ranged from 5 to -7. The mean
bias was negligibly different from zero. The correlations of
scores for the split tests were as follows:

5 H + 5 L items × 5 H + 5 L items, 0.46; corrected, 0.63 (reliability of score)
H - L, 10 items × H - L, 10 items, 0.33; corrected, 0.49 (reliability of bias score)
H + L, 20 items × H - L, 20 items, -0.125 (score × bias)
10 H items × 10 L items, 0.07; corrected, 0.13
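
The “corrected” values in the list above are consistent with the usual Spearman-Brown step-up for doubling test length, although the text does not name the formula; that identification is an assumption made here. A minimal sketch, with a function name chosen for illustration:

```python
def spearman_brown(r_half, factor=2.0):
    """Estimated reliability of a test `factor` times as long as the part
    tests whose intercorrelation is r_half (Spearman-Brown prophecy formula)."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


score_reliability = spearman_brown(0.46)   # half-tests of 5 H + 5 L items
bias_reliability = spearman_brown(0.33)    # half-tests of H - L
h_versus_l = spearman_brown(0.07)          # 10 H items against 10 L items
```

The first and third values round to the printed 0.63 and 0.13. The second comes out slightly under 0.50 rather than the printed 0.49, which would be expected if the printed 0.33 is itself a rounded coefficient.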




The bias score is definitely reliable (the probable error of an r
of .00 is 0.06 for 133 cases). Bias is nearly independent of
pitch ability. The reliability of the bias score might be a re-
flection of the fact that the superior student makes few errors,
and so has a reliably low bias score, but when only cases making

a errors out of twenty are considered, the corrected 

reliability of the bias score is 0.56.
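
The probable error quoted for a zero correlation can be checked with the classical formula PE = 0.6745 (1 - r²) / √N. The text does not state which formula was used, so this is an assumption, and the function name is introduced here for illustration.

```python
import math


def probable_error_of_r(r, n):
    """Classical probable error of a product-moment correlation r from n cases."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


pe_zero = probable_error_of_r(0.0, 133)

# The uncorrected split-half bias correlation (0.33) expressed in
# probable-error units of a zero correlation.
bias_r_in_pe_units = 0.33 / pe_zero
```

On this reckoning the observed 0.33 lies more than five probable errors from zero, which supports the statement that the bias score is reliable.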

A twenty-item test of H items alone would have a predicted
reliability of ...; a test of twenty L items would have a re-
liability of ...; yet the total test, with ten of each type, had a
reliability barely as good as the 10 L items alone. The test
measures two factors, pitch and bias. Wyatt has
reported (32, p. 41), and the writer's experience confirms, that
training for superior discrimination requires deliberate effort
to eliminate these biases. Seashore was evidently not suc-
cessful in designing his test to satisfy his demand that “the
factor under consideration (pitch) must be isolated in order
that we may know what it is that we are measuring” (quoted
by Wyatt, 32, p. 15).

The finding that the twenty items are not measuring a
single factor is of interest in the light of attempts to de-
termine the factor structure of the Seashore test. Guilford's
data (11) have been questioned by Wherry and Gaylord on
statistical grounds (28), but it is also apparent that he failed
to detect the factors in the test because he erroneously as-
sumed that all items having a fixed pitch difference measure
the same trait, whether the higher tone is in first or second
position. His data, revealing different factor loadings for items
of different difficulty, are consistent with the hypothesis that
bias becomes more important as difficulty increases.
Bias may result from a deliberate mental set. In testing
anti-submarine detector personnel for ability to distinguish be-
tween true indications of a submarine and false indications
yielded by the detector, a test was set up which reproduced
true and false indications, the man being required to report
“Yes” or “No,” according as he believed a submarine to be
present or not. Even the best operators tended to show a
bias toward the “Yes” response which reduced their
scores because they reported many false indications as sub-
marines. When attempts were made to train for higher per-
formance, they defended their errors on the ground that the
only safe course in operation is to consider all doubtful cases
as submarines until proved otherwise. Despite the fact that
accurate judgment was desirable to reduce false alarms, it was
found impossible to eliminate this set in order to test discrimi-
nation ability.

5. Speed versus accuracy. In many tests, speed is an im-
portant element. In taking such a test, the student has his
choice of responding carefully, or of answering rapidly, achiev-
ing a score through quantity rather than accuracy. Although
empirical scoring devices may compensate for this effect over
a group as a whole, a given individual’s score depends on his
set to be rapid or to be accurate. This influence is especially
serious on a test such as the Nelson-Denny reading test, which
presents five-choice items scored by the formula “number
right.” The writer recently reviewed scores of a class of tenth
graders. One student, selected by his teacher as a retarded
reader, and having a test IQ of 69, appeared in the list of scores
at the 60th percentile on the Nelson-Denny vocabulary test.
He had merely rushed through the test, guessing at every item;
by chance, he had answered twenty-five out of 100 items cor-
rectly. Even the best correction formula can only estimate how
much a speedy, careless student would earn had he been care-
ful. The criticism offered here on the Nelson-Denny test and
others of like pattern applies primarily to the meaningfulness
of scores; in a test designed to predict grades, the scoring
procedure used may be justified empirically in the majority of
cases.
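
What a correction formula does to the rushed student's score can be sketched with the conventional correction for guessing, rights minus wrongs divided by one less than the number of choices. The text does not say which correction it has in mind, so the formula, the function name, and the second (careful) student are assumptions for illustration.

```python
def corrected_score(rights, wrongs, choices):
    """Conventional correction for guessing: R - W / (k - 1)."""
    return rights - wrongs / (choices - 1)


# The student described above: all 100 five-choice items attempted, 25 right.
rushed = corrected_score(rights=25, wrongs=75, choices=5)

# A hypothetical careful student with the same number right but few attempts.
careful = corrected_score(rights=25, wrongs=5, choices=5)
```

With “number right” scoring both students earn 25. The correction separates them, but, as the text notes, it can only estimate what the rushed student would have earned had he worked carefully.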
Hall and Robinson (12) made a factor analysis of 25 read-
ing scores, and concluded that the first factor in their battery
was best named “attitude of comprehension accuracy”; this
response set appeared more prominently than the ability
factors.
6. Response sets on essay tests. Anyone who has taken or
given an essay test is faced with the effect of different sets on
scores. In addition to inclusiveness, there are probably as
many different response sets as there are styles of composition.
The student may write a carefully organized response, or he
may write an answer with no effort
at organization. Whether he receives as much credit as an-
other student with a different set depends on the method of
scoring. If credit is given for organization, the former set is
rewarded; if for number of ideas expressed, the second
may earn the higher score. Some students attempt to
write essays, while others merely list points in a skeleton
form. Some elaborate their answers and bring in il-
lustrations, while others present the “bare bones” of the re-
sponse. What type of answer is given appears dependent upon
set, and the student's idea of what is desired. The wise student
tries to learn what type of answer is favored, and adjusts his
procedure accordingly, but this adjustment cannot take place
unless the test situation itself is altered by providing a cue to
the teacher's desires. Campbell has discussed individual modes
of reaction to open questions for public opinion polling (1).


Characteristics of Response Sets 


Individual differences in response sets are reliable. Nu-
merous studies have shown the reliability of differences in the
sets described above, by internal-consistency tests. Several
experiments have measured response sets in tests of a given
type using several tests, sometimes with separation in time;
these have shown substantial correlations.

Response sets have the greatest influence on score in am-
biguous or unstructured situations. If a situation is structured
for the student so that he knows the answer required, he re-
sponds directly to the content of the item, and response sets
probably are unimportant. If he does not know the answer,
his response is determined by caution, acquiescence, or other
sets. Acquiescence appears on difficult true-false items; bias on
difficult pitch judgments; and evasiveness on attitude judg-
ments where the student has no strong opinion. Ambiguity
may be increased by the test situation or by directions which
leave the student to judge whether guessing is penalized,
whether speed is more rewarding than carefulness, how many
statements should be checked, or what is meant by “Indiffer-
ent.” The problem is reminiscent of those encountered in
rating scales, where traits and scale positions must be made un-
ambiguous to obtain validity. No “objective” test is truly ob-
jective, so long as any part of the stimulus situation is suffi-
ciently unstructured to permit individual interpretation.
Degree of structuration varies from the item where the response
is obvious for the group tested, to the one where the student
has no idea what is wanted.

Response sets affect test reliability. Because they are con-
sistent, response sets may heighten reliability (30). In other
situations where the response set lowers the correlation between
items, reliability appears to be lowered. Where a response set,
such as gambling versus caution, increases the spread of scores,
reliability will tend to rise. Where a response set such as
acquiescence reduces the range of scores, the reliability may be
expected to decline. The relative reliabilities of the trait under
study and the response set, and the variation of the group in
each, must also be considered.

Response sets affect test validity. Since a response set
permits persons with the same knowledge or attitude or ability
to receive different scores, response sets always lower the logi-
cal validity of a test. Empirical validity, based on ability to
predict a criterion, may be raised or lowered, depending on the
correlation of the response set with the criterion. Tendency to
gamble may reflect primarily confidence, in which case the
better student might be less cautious when in doubt, and in-
crease his score; but should willingness to gamble and ability
have no relation, or a negative relation, empirical validity would
be lowered. In a test of morale, where the person responds
“Yes” or “No” to pessimistic predictions, acquiescence may be
correlated with everyday morale, because it influences not only
the test performance, but also the readiness to believe rumors;
in this case, the response set could raise correlations with
criteria.

Response sets interfere with inferences from test data. It
becomes difficult to judge learning difficulties from item
analysis, since response sets influence the percentage of students
passing an item. It is difficult to evaluate pupil growth when
response sets affect score, as a major change of score may reflect,
not growth in knowledge or change in interest, but a set-de-
termined change in inclusiveness, caution, or avoidance of ex-
treme response positions. There is no way of knowing that
even a drastic change in response pattern may not be due to
temporary moods, or to increased test-wiseness, rather than to
basic learning.

The Nature of Response Sets 

Response sets are a special case of the learned “work meth-
ods” discussed by R. H. Seashore (21), Jones and Seashore
(13), and Sargent (20). These writers point out that a test
measures, not the subject's ability, but his performance with
whatever methods he uses; a change of technique might change
the ability score. Seashore comments:

In measuring individual differences it is not sufficient to
control the instructions or working situation, for the observer’s
previous incidental background may lead him to adopt very
different work methods than those expected. It follows that
... limited to ordinary instructions and demonstrations
... that other unnoticed factors will ...
modify the work method actually adopted. (21, p. 123.)

Work methods may be temporary sets or may be habitual
techniques of performance.
Response sets may also be compared with constant errors
in psychophysics. In fact, the bias reported in connection with
pitch has been studied as an error in judgment (31, p. 439).
The emphasis, in treating this as a response set, is on the evi-
dence that there are consistent individual differences, though
the error may be “constant” for the individual. Since this
error can be at least partly overcome by conscious attention to
it, it does not seem superfluous to introduce the concept of
individual differences in set.
Sherif and Cantril (22) describe what we have called re-
sponse sets in terms of frames of reference in their recent re-
view of the psychology of attitudes. They review studies by
Külpe, Sherif, Durkheim and others, all of which con-
firm the point that internal conditions of the organism de-
termine response in any partially unstructured situation. They
write,

. . . in the absence of an objective scale (frame) and
objective standard (reference point) each individual builds
up a scale of his own and a standard within that scale. The
range and reference point established by each individual is
peculiar to himself when facing the situation alone. . . . Once
a scale is established there is a tendency for the individual to
preserve this scale in subsequent sessions (within a week in
these experiments). . . .

. . . these frames and points of reference are by no means
always confined to consciously accepted instructions or im-
posed norms but can become established without an individ-
ual’s realization of it. (22, LII, p. 319; LIII, p. 2.)

Essentially the notion of response sets here developed is an ap-
plication to testing of the findings reviewed by Sherif and
Cantril.

The crucial question for an understanding of response sets
is the extent to which they are transient or fixed. Is an acqui-
escent person equally acquiescent in a history test, a chemistry
test, and an adjustment inventory? Is the cautious, evasive
person equally so in a grammar test, a personality test, and an
attitude scale? In experimental studies there seems to be
consistency of frames of reference from one trial to another.
Attempts to compare scores in different types of tests
have shown only negative results, and this would be expected
even if response sets are basic in the personality. For response
sets operate in proportion as a situation is unstructured.
The student who finds a psychology test unstructured because
of his ignorance, may be able to answer his chemistry on
the basis of knowledge. Unless degree of structuration could
be equated for all individuals, correlations of “response set
scores” from test to test are meaningless.

For the present one cannot decide to what extent a reaction
such as evasiveness is specific to the immediate test situation
at a particular time under particular conditions, to what ex-
tent it would be expected to recur in similar situations, or how
much it reflects a basic trait that would appear in any
situation permitting evasion, if that situation is unstructured.
Probably all three interpretations are valid.
Light is thrown on response sets by the Rorschach test.
Unlike the usual test, where the content is crucial and the form
of response is disregarded in the interpretation, the Rorschach
interpretation is based almost entirely on the mode of response,
the work method, or the response set shown by the subject.
The stimulus is almost completely unstructured, and the sub-
ject is allowed to interpret his task as he chooses. The “ap-
proach,” or “apperception type” reflects the set of the subject
to respond ... It is recog-
nized that temporary anxiety or desire to impress the tester
also influence scores. Rorschach results suggest
that in any relatively unstructured situation, including “ob-
jective” tests, the response set of the subject may reflect per-
sonality as well as learned habits of response to the particular
test. It is interesting to speculate on analogies between
Rorschach signs and the response sets in other tests.
Caution in achievement tests may spring from the same trait
that leads to form-accuracy; evasion in personality and attitude
tests may relate to Rorschach rejection and other withdrawing
or inactivity indicators. Inclusiveness may compare with the
number of responses in the Rorschach; negativism, the opposite
of acquiescence, might have its analog in the white-space
response on the projective test. Essay examinations are
potentially projective situations, and the response sets,
organized sequence of attack, etc., have their
counterparts in the inkblot test.


ac 


Controlling Response Sets 


It is more important to control response sets in some situ-
ations than in others. Where a test is easy and there is a wide
range of ability in the group, response sets have little effect;
difficult items, or a homogeneous group, permit response sets
to exert greater influence. In some cases, response sets im-
prove reliability and even empirical validity. But response sets
always lower the validity with which one measures the trait de-
fined by the content of the items. Even though the effect
may be small, the writer feels that response sets should
be eliminated where possible. It is only by identifying and
rooting out, one by one, the factors that dilute validity that
educational and psychological tests can increase their usefulness
as scientific tools.

The first step in controlling response sets is to identify the
sets possible in a particular test. The list above includes all
the significant sets the writer has been able to identify; but
others may also exist.

Response sets are reduced by any procedure that increases
the structuration of the test situation. The best procedure ap-
pears to be to adopt an item form which does not invite re-
sponse sets, wherever that can be done without hampering
measurement. The multiple-choice pattern appears to be the
only generally useful form that is free from response sets.
This form should be adopted wherever the content permits.
This applies both to achievement tests and to other types. The
Kuder Preference Record uses this form for measuring interests,
as contrasted with the Strong blank, which allows several re-
sponse sets. Where the Bernreuter, Bell, Multiphasic, and
other inventories are open to evasion and acquiescence, Jurgen-
sen (14) and Viteles (27) recently reported promising attempts
to obtain more valid answers in personality tests by a multiple-
choice pattern. In attitude tests, experimentation with a
multiple-choice form in which the student checks the statement
he most agrees with, in each group, seems desirable. Other pat-
terns may be modified to eliminate opportunities for response
sets. The writer would reduce the five-choice pattern of the
Likert-type scale to a two-choice judgment; he would discard
the “?,” “Neutral,” and “Indifferent” responses from the three-
choice pattern. This may reduce reliability, which in the past
has been increased by the effect of response sets upon the score.
Eisenberg has suggested that personality tests would be made
less ambiguous by increasing the number of alternatives per
item (7, p. 39), but the writer feels that this places stronger
weight on semantic factors. Woodworth indicates that the
three-category scale in psychophysical judgment is neither
better nor worse than the two-category scale (31, p. 5), and
favors retention of the neutral judgment in rating scales (31,
p. 377). His arguments, however, apply to measuring thresh-
olds and differences between rated objects, not to the
problem of studying individual differences between judges. It
might also improve Likert-type scales to define the alternatives,
such as “strongly agree,” more objectively, as in the better
rating scales (23).
Directions may be changed to increase structuration. Dif-
ferences in tendency to gamble may be eliminated by directing
students to respond to every item. Gritten and Johnson (10)
and the writer favor this suggestion, despite all that has been
written against it; encouraging guessing increases the random
errors of measurement, but it is the only means of eliminating
the systematic error resulting from response set. In attitude
tests of the Thurstone type, it might be helpful to direct the
student to check, e.g., the four statements best describing his
beliefs, to eliminate variation in inclusiveness. Changes in the
test should not be allowed to interfere with the measurement
intended; it might make a better statistical instrument of the
Mooney Problem Checklist to limit the number of responses,
but this would make the list less satisfactory for interviewing and
counselling. Directions indicating what is wanted and defining
ambiguous terms may be particularly profitable in essay tests.
A partly successful use of this procedure in psycho-
physical judgments is reported by Woodworth (31, p. ...).
Experimental attempts to structure the entire test, as by
informing students that just half the items are true, have not
been successful (6, 4). Each response is a separate judgment,
and attempts to increase structuration must be
aimed at the individual item.
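
The effect of inclusiveness on a Thurstone-type score, and hence the reason for fixing the number of statements to be checked, can be sketched as follows. The statement labels and scale values are hypothetical, chosen only for illustration; the scoring rule (median scale value of the statements checked) is the one described earlier in this paper.

```python
from statistics import median

# Hypothetical scale values for six statements on a Thurstone-type scale.
SCALE_VALUES = {"s1": 1.5, "s2": 3.0, "s3": 4.5, "s4": 6.0, "s5": 7.5, "s6": 9.0}


def thurstone_score(checked):
    """Median scale value of the statements a subject checked."""
    return median(SCALE_VALUES[s] for s in checked)


# Two subjects assumed to hold the same attitude but differing in inclusiveness.
narrow = thurstone_score(["s5", "s6"])                 # checks only the closest statements
inclusive = thurstone_score(["s3", "s4", "s5", "s6"])  # also checks milder statements
```

In this constructed case the inclusive subject's score is pulled toward the middle of the scale. Directing every subject to check the same number of statements removes that source of variation, at the cost noted above for instruments such as checklists used in counselling.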
In many economical and desirable test forms, it is not
possible to remove response sets. Other approaches to dealing
with the problem are required. One of the best is increasing
the test-wiseness of the student. By showing the acquiescent stu-
dent how many False-marked-True errors he makes, in com-
parison with True-marked-False, it is often possible to help
him become consciously critical in answering questions. En-
couragement may overcome overcaution which is causing a
student to receive poorer scores than he deserves. Training to
overcome bias has already been mentioned. It is relatively easy
to teach students what is desired in essay examination
responses.


If the tester is conscious of response sets, he can examine 
papers to determine what sets may have affected scores. The 
device of the Minnesota Multiphasic, which discards as invalid 
all tests showing excessive evasiveness, might be useful in other 
instruments. 

The basic problem in response sets is that we are assuming
that a test score measures a single variable. Actually, even if
item content is homogeneous, item form introduces several
variables into a score. The A-U-D pattern for attitude tests
permits two degrees of freedom in the response to any item.
The A-a-u-d-D pattern introduces 4 d.f., which might be named
reaction to content, neutrality or evasiveness, acquiescence ver-
sus negativism, and tendency to go to extremes. The simplest
approach to this problem, if the response set cannot be elim-
inated, is to report as many scores for each individual as
there are degrees of freedom in the test pattern. The scores
for a person can be successfully interpreted as a profile or
pattern. This conforms to current organismic attempts to con-
sider the total behavior in the test situation, as used both in the
Rorschach test, and the relatively structured tests 1.3, 4.2, and
8.2 of the Progressive Education Association. If statistical
treatment is to be made, it is important to retain all degrees of
freedom. Attempts to reduce L-I-D percentages to a single
score always discard information. By a choice of two scores or
functions of scores (L - D, L/(L + D), etc.), meaningful relations
may be made clearer. In an interest test of the L-I-D type, a
collaborator and the writer found that the most meaningful
picture would be obtained by plotting scores in a two-space with
three homogeneous coordinates. Statistical methods for such a
space can be devised which permit considering all variables at
once.

One final suggestion is to weight responses so that in the
majority of cases the response set increases validity. Since the
majority of persons, when in doubt, tend to judge statements
true, doubt may be penalized by counting false responses more
heavily, or by loading the test with a majority of false state-
ments. This increases the validity of scores, on the whole, but
gives a spuriously high score to the occasional highly critical
individual. This practice underlies such weighted scoring as
used by Strong and Bernreuter, which makes their tests reliable
on the whole, however invalid they may be for a person with an
atypical set. In such sets as gambling versus caution, or speed
versus accuracy, the score is normally weighted to favor one
particular set.
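
A minimal sketch of the weighting idea follows. The weight of 2 for false items and the two students' item counts are assumptions made for illustration; they are not taken from any of the scoring keys mentioned.

```python
def weighted_score(right_on_true, right_on_false, false_weight=2):
    """Score in which correct answers to false items count `false_weight` times."""
    return right_on_true + false_weight * right_on_false


# Hypothetical test of 10 true and 10 false items.
# Student A guesses "True" when in doubt; student B guesses evenly.
a_unweighted = weighted_score(8, 4, false_weight=1)
b_unweighted = weighted_score(6, 6, false_weight=1)
a_weighted = weighted_score(8, 4)
b_weighted = weighted_score(6, 6)
```

Unweighted, the two students tie at 12. With false items counted double, the acquiescent guesser falls behind. As the text warns, the same weighting would also reward a habitually critical student who marks “False” when in doubt.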

Summary and Conclusions 


1. Response sets are defined as any tendency causing a
person to make different responses to test items
than he would have made had the same content been presented
in a different form.

2. Evidence is presented, or cited from other studies, to
demonstrate the existence of these response sets:

a) Tendency to gamble; caution versus incaution. This
is found in usual objective examinations, and appears as
evasiveness in personality, interest, and attitude tests.

b) Definition of judgment categories. Individuals dif-
fer in the meaning assigned to, and the frequency of use
of, alternatives offered in attitude and personality scales.

c) Inclusiveness, the tendency to give many responses
where the number of statements to be checked, or the
like, is unspecified. This appears in certain tests of
reasoning, attitudes, adjustment, etc.

d) Bias; acquiescence. This appears in true-false tests,
discrimination tests, and some attitude, personality, and
interest tests.

e) Speed versus accuracy.

f) Miscellaneous response sets on essay tests, related
to style of response.

3. Individual differences in response sets are reliable.

4. Response sets have the greatest influence on performance
in ambiguous or unstructured situations.

5. Response sets may raise or lower reliability, and may
raise or lower validity as measured by correlations with criteria.
But because they permit persons with equal knowledge, identi-
cal attitudes, or equal amounts of a personality trait to receive
different scores, response sets always reduce logical validity.
Response sets interfere with interpreting test data to reveal
difficulty of item content, or growth as a result of training.

6. It is uncertain whether response sets are specific to a
given type of situation, or whether the student who is acqui-
escent on an achievement test would also be acquiescent in
personality and attitude tests if they were equally unstructured
for him. Temporary variations in mood or set may influence
performance, but retest studies show stability in response sets.

7. The following suggestions are made for eliminating the 
effect of response sets upon test validity: 

a) The multiple-choice form, which appears free from
response sets, should be used wherever possible.

b) The test pattern should be made less ambiguous, by
reducing the number of alternatives for a judgment and
eliminating the neutral response. Alternatives in Likert-type
scales should be objectively defined.

c) Directions should be changed to eliminate variations
in response set. Directions to respond when in doubt
are recommended.

d) The test-wiseness of the student may be increased by 
an explanation regarding his response sets. 

e) Scores of persons revealing strong response sets may
be discarded.

f) Because most item forms permit more than one degree
of freedom in the response, methods of retaining all
of the information are needed. Interpretation of profiles
or patterns of scores is desirable. Statistical methods
for handling two scores at once are referred to.

g) Scores may be weighted so that response tendencies
which correlate with lack of knowledge in the majority
of cases are penalized.

There are many points in the response-set hypothesis not yet
supported by direct evidence, but it appears that sufficient
evidence is available to prove that a real effect is present. It
may seem that the points raised are trivial, in view of the
service rendered in the past by personality, interest, attitude,
and achievement tests where sets are permitted to influence
scores. Yet recognition and control of such irrelevant factors
are precisely the improvements needed to raise mental measurement
from its present imperfect level.

Further research, including experimental validation of the
suggestions for controlling sets here offered, is called for. If
methods of study can be found, knowledge is required regarding
the nature and origin of individual differences in response sets.

The only sound procedure for controlling structuration, to
study response sets unaffected by the person's knowledge about
item content, is to use nonsense items where no one has a
knowledge of the content. This is difficult to use on any large
scale with the retention of normal test attitudes.




THE EFFECT ON A CANDIDATE'S SCORE OF 
REPEATING THE SCHOLASTIC APTITUDE 
TEST OF THE COLLEGE ENTRANCE 
EXAMINATION BOARD 


RUTH C. STALNAKER and JOHN M. STALNAKER
Stanford University 
The Scholastic Aptitude Test of the College Entrance
Examination Board is a reliable test of verbal ability¹ which
is given four times a year and is taken each year by over
20,000 candidates for admission to selected colleges. Scores are
reported on a scale which has a mean of 500 and a standard
deviation of 100. New forms of the test are prepared each
year. On the basis of a certain amount of common material in
parallel forms of the test, the scores are equated from year to
year so that a score will not have one interpretation if the April
test is taken and a different meaning if the form used at another
session is considered. That is, 500 represents the average score
of the "normal" Board population, but not necessarily the
average score of a group taking a given form of the test at any
one session.

Although most candidates take the test in the spring before
they plan to enter college, some take it a year earlier, at the end
of their junior year in secondary school, and a small number take
it at both times. The question naturally arises, therefore, as to
the effect on a candidate's score of his having taken the test
before. If a candidate takes the test twice, with usually a year's
time intervening, is his score higher the second time than it was
the first? Furthermore, if it is higher the second time, is the
increase due to the fact that the candidate has had some practice
in taking the test, or to the fact that he has grown in the ability
being measured? And what of the candidate who takes the test
only once, but in his junior year? His score is being compared
with other candidates who have taken the test in their senior
year. Can the scores be compared directly, or must some
adjustment be made to compensate for the fact that some
candidates took the test as "preliminary" candidates, that is, one
year from college, and some as "final" candidates, or shortly
before entering college? It was in an attempt to find at least
a partial answer to these questions that the study here reported
was undertaken.

¹ Current editions of the test also contain a section on mathematical
aptitude, on which a separate score is reported, but this discussion is
limited to the verbal section.

For a number of years, a small group of about 800 candidates
have repeated the Scholastic Aptitude Test one year after
they had first taken it. These candidates were usually found
to gain about 60 points (.6 standard deviation) on the average
upon repeating the test. However, most of this group were
asked to repeat the test because they had received low scores.
Their average score on the first test was about 440, or .6
standard deviation below the average of the normal group. One
might conclude that 60 points should, therefore, be added to a
preliminary candidate's score to give the score he would receive
a year later. This procedure is not justified.

If a sub-group such as this one has an average score below
the mean of the total group of which it is a selected sample,
it has been found that upon immediate repetition of the test
its average score rises. This is easy to explain if one assumes that
each score represents the sum of a candidate's true score (that
which exactly represents his ability) and an error score, which
may be positive or negative. For candidates scoring below
average, the error score is apt to be negative. Error scores are
assumed to be unrelated, so upon repetition of the test a
group scoring low will tend to shift their scores toward the mean
of the total group. With a test having a reliability (test-retest
type) of .94 for the total population, a sub-group with a mean
of 440 on the first form might be expected to average 444 on the
second form, taken without any significant lapse of time.
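
The figure of 444 is what the usual regression-toward-the-mean relation gives when the sub-group's deviation from the population mean is shrunk by the reliability coefficient. A minimal sketch in Python, assuming that linear relation holds; the function name and arguments are illustrative and do not come from the study:

```python
def expected_retest_mean(first_mean, population_mean, reliability):
    """Expected mean on an immediate retest for a selected sub-group.

    Assumes the classical model (observed score = true score + error) with
    uncorrelated errors on the two forms, so the sub-group's deviation from
    the population mean shrinks by the reliability coefficient.
    """
    return population_mean + reliability * (first_mean - population_mean)


# Figures quoted in the text: population mean 500, reliability .94,
# sub-group mean of 440 on the first form.
estimate = expected_retest_mean(440, 500, 0.94)
print(round(estimate, 1))
```

On these assumptions only about 4 of the roughly 60 points gained by the low-scoring repeaters would be attributable to regression alone.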

In 1942, because of large-scale changes in the admission
procedures of most of the colleges making greatest use of the
Board tests, all candidates seeking admission to Board colleges
were asked to take the Scholastic Aptitude Test in April,
whether or not they had taken it previously. As a result, about
2,000 candidates who had taken the test in June 1941 took the
form of the test given in April 1942. This group, being fairly
"normal" in ability, with a mean of 511 and a standard deviation
of 96 on the 1941 test, provided the data for a study of the effect
of repeating the test. The group consisted of 1604 girls and
396 boys. The majority of the candidates, both boys and girls,
were from independent schools, but 347 of the girls and 126 of
the boys were from public schools. The proportions of boys
and girls are not typical of the proportions in the total Board
population,² nor are the proportions from the two types of
schools. The preponderance of girls in this group is due to the
policy of the colleges for women of encouraging their candidates
to take the Scholastic Aptitude Test in their junior year and
other tests in their senior year.

                                 TABLE 1

A Comparison of the Mean Scores Obtained on Two Forms of the Scholastic
Aptitude Test by 2000 Candidates Who Took Both Forms*

                                 Mean Scores     Standard Deviations
                       Number  1941  1942  Gain   1941  1942  Gain   Correlation
Boys:
  Independent Schools    270    496   [?]    47    [?]   100   [?]      .96
  Public Schools         126    510   561    51     95    94    35      .93
  Total                  396    501   548    47    101   100    32      .95
Girls:
  Independent Schools   1257    519   571    52     96    91    33      .94
  Public Schools         347    495   550    55     91    91    34      .93
  Total                 1604    514   566    52     95    92    33      .94
All Candidates          2000    511   563    52     96    94    33      .94

* The scores are converted scores on a scale which has a mean of 500 and a
standard deviation of 100 for the normal Board population. The scores on the
two forms of the test are equated.

Table 1 shows that the average score of this group of 2,000
candidates increased 52 points when they repeated the test
after an interval of ten months. The girls' scores increased
more than those of the boys, although the difference is slight.
The candidates from public schools gained more than those
from independent schools, but here again the difference is small.
The differences in sex and in type of school seem to have little
effect on the amount of increase one may expect in a candidate's
score.

² In April, for example, 50 per cent of the 16,626 candidates who took the
test were from public schools and 50 per cent from independent schools.
Fifty-eight per cent of the total group were boys; 42 per cent were girls.

The question next arises as to whether candidates at all
levels of ability increase their scores to the same extent. Table
2 shows the average scores obtained by candidates classified
according to their scores on the first test.

                                 TABLE 2

A Comparison of the Mean Scores Obtained on Two Forms of the Scholastic
Aptitude Test by Candidates Classified According to their Scores
on the First Form

                           Number     Mean Scores     Standard Deviations
        1941 Score        of Cases  1941  1942  Gain   1941  1942  Gain   Correlation
Boys    600 and higher        73     650   685    35     36    40    26      .77
        500-599              122     547   595    48     28   [?]    35      .57
        400-499              128     453   504    51     28    43    36      [?]
        Below 400             73     [?]   [?]    43    [?]   [?]   [?]      [?]

Girls   600 and higher       321     655   691    36     42    42    26      [?]
        500-599              521     546   595    49     29    41   [?]      [?]
        400-499              571     455   516    61     28    46    36      [?]
        Below 400            191     358   429    71     26    43   [?]      [?]

Total   600 and higher       394     654   690    36     41    41    26      [?]
        500-599              643     547   595    48     29    41    35      [?]
        400-499              699     454   514    60     28    45    36      [?]
        Below 400            264     366   424    58     29   [?]   [?]      [?]

According to theory, the group scoring highest on the first
test should show the smallest gain and the group scoring the
lowest on the first administration, the highest gain. Actually,
candidates who are below average on the first test do raise their
scores considerably more than do those who are above average
the first time. Candidates scoring above 600 the first time
averaged 36 points higher upon repeating the test; those scoring
from 500-599 raised their average 48 points. The group scoring
from 400-499 increased their average 60 points, and the lowest
group (below 400), 58 points.

There is little difference between boys and girls in the two
groups scoring above average. However, the boys in the 400-499
range averaged an increase of 51 points; the girls in the
same range an increase of 61 points. In the lowest group, the
boys increased 43 points, the girls 71 points.


In order to determine whether or not candidates increase
their scores on one type of item to a greater extent than on
another, comparisons were made between the scores on each of
the three subtests in the two forms. The 100 items of subtest
one each consist of four adjectives, from which the candidate
is asked to select the two which are most nearly opposite in
meaning. The fifty items in subtest two present a pair of
related words; the candidate is asked to select from a given list
of words the pair which represents a relationship most nearly
similar to that of the given words. The third subtest consists
of paragraphs in each of which one word has been changed so as
to spoil the meaning; the candidate is asked to find that word.
All items in the test have been pretested, and none is included
which has a bi-serial coefficient of lower than [illegible] with
the total score. The scores on the three subtests are highly
related to one another.³

³ In 1942, for example, subtest 1 correlated .79 with subtest 2 and .80
with subtest 3. Subtests 2 and 3 correlated .78 with one another.

                                 TABLE 3

A Comparison of the Mean Scores Obtained on the Subtests of Two Forms of the
Scholastic Aptitude Test by 2000 Candidates Who Took Both Forms*

                                    Standard
                           Mean     Deviation   Correlation
1941 Antonyms             51.18       9.78
1942 Antonyms             56.37       [?]           .89
     Gain                  [?]
1941 Analogies            50.84       9.65
1942 Analogies            55.47       [?]           .82
     Gain                  4.6[?]
1941 Paragraphs           51.29       9.45
1942 Paragraphs            [?]        [?]           [?]
     Gain                  3.9[?]
1941 Total Test             511         96
1942 Total Test             563         94          .94
     Gain                    52

* The scores on the subtests are standard scores based on a mean of 50 and a
standard deviation of 10 for the standard Board population. The total scores
are based on a mean of 500 and a standard deviation of 100 for the same
population.

Table 3 shows the mean scores obtained on each of these
subtests in 1941 and in 1942 by the group of candidates who
took both forms. In order to make the scores comparable from
one subtest to the other, all subtest scores have been reduced to
standard scores with a mean of 50 and a standard deviation of
10 for the standard group. From this table it can be shown
that the largest gain was made in subtest one (Antonyms),
the next highest in subtest two (Analogies) and the smallest
gain in subtest three (Paragraphs). The order is the same for
all groups.
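
Because the subtests are reported on a 50/10 scale and the total test on a 500/100 scale, gains are easiest to compare as fractions of the scale's standard deviation. The sketch below is illustrative only: the helper name is hypothetical, and the means are those read from Table 3 (the 1942 Paragraphs mean is not legible in the source, so that subtest is left out).

```python
def gain_in_sd_units(mean_first, mean_second, scale_sd):
    """Express a mean gain as a fraction of the scale's standard deviation."""
    return (mean_second - mean_first) / scale_sd


# Means as read from Table 3: subtests on a scale with SD 10,
# total test on a scale with SD 100.
antonyms = gain_in_sd_units(51.18, 56.37, 10)
analogies = gain_in_sd_units(50.84, 55.47, 10)
total = gain_in_sd_units(511, 563, 100)

print(round(antonyms, 2), round(analogies, 2), round(total, 2))
```

Put this way, the Antonyms gain and the total-test gain each come to roughly half a standard deviation, with the Analogies gain slightly smaller.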
Having established the fact that candidates do in general
increase their scores considerably upon repeating the Scholastic
Aptitude Test, we still have no evidence as to the proportion
of the change which is due to growth in the verbal factor and
the proportion due to "practice," or familiarity with the types
of items. This question is of practical significance, especially
for estimating the score which a candidate who takes the test
some time before entering college might be expected to make
if he had waited a year. However, it is apparent that there is
no easy way of determining exactly how much of a given
increase is due to practice and how much to growth when there
is an interval of approximately a year between the two tests.
One would expect that the shorter the interval the greater the
effect of practice, and the less the effect of growth. Therefore,
if the effect of practice could be determined when the interval
is very short, it should be safe to conclude that it is probably
no greater when the interval is longer.
An attempt is made to equalize the effect of practice on
familiarity with item types by sending to all candidates in
advance of the test a practice booklet. This booklet contains
from ten to fifteen items of each type included in the test, with
complete instructions for answering each kind of item. Thus,
when a candidate arrives at the examination room, he should
understand the problem involved in each kind of question,
whether or not he has taken the test before.

Of course, it is hardly feasible to administer two forms
of the complete test to a large group at one sitting in order to
determine the maximum effect of "practice." However, data
are available on several groups of candidates who have taken
two different forms of subtest one (100 Antonyms) at a
single sitting. In December, 1940, a group of 141 college
freshmen, all of whom had taken the complete Scholastic Aptitude
Test earlier in 1940, took at one session two parallel
forms of the Antonyms subtest. Thirty minutes were allowed
for each form (the same amount of time allowed for subtest
one in the regular test). The 141 candidates were divided into
two groups of approximately equal ability as determined by
their scores on the Scholastic Aptitude Test. (The 71 candidates
in group 1 had a mean of 574 and a standard deviation
of 84; the 70 candidates in group 2 had a mean of 574 and a
standard deviation of 88.) Group 1 took the Antonyms test
form A first, followed immediately by form B; group 2 took the
two forms in reverse order. The results are given in Table 4.

                                 TABLE 4

The Means and Standard Deviations of the Scores Obtained on Two Antonyms
Tests Taken at One Session

                                                     Group 1*   Group 2*
Number of Candidates                                    71         70
Scores on Aptitude Test       Mean                     574        574
                              Standard Deviation        84         88
Scores on Form A Antonyms     Mean                   58.98      59.36
                              Standard Deviation      7.63       7.83
Scores on Form B Antonyms     Mean                   58.81      59.13
                              Standard Deviation      8.98       8.51
Gain on Second Form                                   -.17        .23

* Group 1 took form A first, followed immediately by form B; Group 2 took
form B first.

Each group received a slightly higher average score on form A,
regardless of whether they took that form first or second. Their
scores on the second form of the test reflected no "practice
effect" whatsoever.
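
The logic of the counterbalanced design can be made explicit with a short computation. This is a sketch only, reading the Table 4 means as 58.98, 58.81, 59.13 and 59.36; the dictionary layout and function name are illustrative assumptions, not part of the study.

```python
# Mean Antonyms scores from Table 4 (standard-score units).
# Group 1 took form A first; group 2 took form B first.
groups = {
    "group_1": {"first": 58.98, "second": 58.81},  # form A, then form B
    "group_2": {"first": 59.13, "second": 59.36},  # form B, then form A
}


def practice_gain(group):
    """Mean on the form taken second minus mean on the form taken first."""
    return group["second"] - group["first"]


gains = {name: round(practice_gain(g), 2) for name, g in groups.items()}

# Averaging over both orders cancels any difference in difficulty between
# forms A and B, leaving a rough estimate of the practice effect alone.
average_practice_effect = round(sum(gains.values()) / len(gains), 2)

print(gains, average_practice_effect)
```

Averaged over the two orders, the estimated practice effect is a small fraction of one point on a scale whose standard deviation is 10, which is consistent with the authors' reading of the table.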
Two other groups, part of the large group of 2,000 1941-42
candidates, took a second Antonyms subtest (form A or form
B) immediately following the regular Scholastic Aptitude Test
in June 1941. For each of these two groups, therefore,
three Antonyms scores are available, the first two taken at one
sitting, the third ten months later. Each of these groups is a
random sample of the larger group. Table 5 shows the mean
scores of each of these groups on the three Antonyms
subtests as well as on the two complete tests. The first and third
Antonyms subtests taken are the same for both of these groups
(that is, part of the regular tests given in June 1941 and April
1942); the second subtest (taken immediately following the
first) was in two forms, A and B, each group taking one
form. Each of these groups received a slightly higher score on
the second Antonyms test, group one making a gain of .21
(or .02 standard deviation) and group two a gain of [illegible]
(or [illegible] standard deviation). These gains are so small as
to indicate that the effect of practice, when there is no interval
between the two forms, is slight. To the extent that the Antonyms
subtest is typical of the total test, this same conclusion may be
drawn regarding the test as a whole.

                                 TABLE 5

The Mean Scores Obtained on Three Antonyms Subtests by Two Groups
of Candidates*

                                               Group 1      Group 2
                                              (n = 213)    (n = 216)
First Antonyms test                             50.81         [?]
Second Antonyms test                            51.02        52.37
Third Antonyms test                             55.93         [?]
First total score                                 505         [?]
Second total score                                557         [?]
Gain from first to second antonyms score          .21         [?]
Gain from first to third antonyms score          5.12         6.52
Gain in total score                                52           58

* The first and third antonyms subtests are the same for both groups. The
second subtest is different for each of the two groups, but was taken by both
groups immediately following the first form of the Aptitude Test.


Conclusions 


The data presented here lead to the following conclusions
regarding increases in score on the Scholastic Aptitude Test
when the second form of the test is taken approximately one
year after the first:

1. Candidates may be expected to receive scores considerably
higher the second time they take the test. The average
increase of all candidates in this group was 52 points; the
standard deviation of the increase was 33. That is, about
two-thirds of the candidates increased their scores between 19
and 85 points.

2. Differences in type of school and in sex seem to have
little effect on the amount of increase.

3. Candidates scoring below average the first time they take
the test make larger increases, on the average, than candidates
scoring above average. Candidates scoring above 600 the first
time make the smallest increases, although even this group
averaged an increase of 36 points.

4. Since the effect of practice in taking the test appears to
be slight with no time elapsing between the two tests, it is
reasonable to conclude that practically all of the increase in a
candidate's score is due to growth in the verbal factor, and not
to increased familiarity with the type of test.

5. Candidates who take the test as juniors (eleventh grade)
may be expected to score lower than they would if they waited
a year longer before taking the test.

6. Repetition of the test gives, on the average, no special
advantage over taking the test only once in the senior year.
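
The range given in conclusion 1 is one standard deviation on either side of the average increase. A brief sketch of that arithmetic follows; it assumes, as the study does not state, that individual gains are roughly normally distributed, and it uses only the mean of 52 and standard deviation of 33 reported above.

```python
from statistics import NormalDist

mean_gain = 52  # average increase reported for all candidates
sd_gain = 33    # standard deviation of the increase

# Assumption: individual gains follow an approximately normal distribution.
gain_distribution = NormalDist(mu=mean_gain, sigma=sd_gain)

low, high = mean_gain - sd_gain, mean_gain + sd_gain
share_within_one_sd = gain_distribution.cdf(high) - gain_distribution.cdf(low)

print(low, high, round(share_within_one_sd, 2))
```

Under the normality assumption a little over two-thirds of candidates fall in this band.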


THE MODIFICATION-REVISION METHOD IN
PSYCHOMOTOR MEASUREMENT¹


JOSEPH E. KING, Jr.
Science Research Associates


This article is a summary of an investigation on the
modification-revision method in psychomotor measurement. The
study was carried out in 1944 at Medical and Psychological
Examining Unit No. 10, an installation of the Army Air Forces
Aviation Psychology Program (4, 5).


The Modification-Revision Method


The modification-revision method in psychomotor measurement
was developed basically as a technique for securing
additional measures of performance from existing apparatus
tests. For example, the subject in operating a psychomotor
test involving the manipulation of a stick similar to that of a plane
exerted varying degrees of grip pressure. A modification was
designed to secure a measure of this hand pressure during the
operation of the basic test. Similarly, the revision was
developed to secure an additional measure of performance from
existing apparatus tests by increasing the complexity of the
problem of the basic test. The subject might be required to
solve a second problem simultaneously with that of the basic
test, thus dividing his attention between two stimuli; or he
might be required to solve the problem of the basic test when
its stimulus situation had been altered. The basic tests employed
in this study were the Complex Coordination Test and
the [name illegible] Test used in the Aviation Psychology Program
for the selection of aircrew candidates (4).

The modification measure may thus be described as a "test
within a test." It was postulated that while the subject was
solving the problem of the basic test, he was employing skills
that were not being measured by the basic test score. The
hypothesis underlying the development of modifications was
based upon previous civilian (3) and military (8, 9) research
on the measurement of visceral and muscular behaviors
accompanying the solution of a problem situation. Modification
measures were developed to sample such secondary behaviors
as hand and leg tension and motility in the operation of the
controls of the basic test, precision and steadiness in the control
operation, and reaction to a pattern configuration by the
movement of the controls.

¹ This study was conducted as part of the AAF Aviation Psychology Program.

The revision measure may similarly be described as a "test
upon a test." It was postulated that the addition of a further
problem to be solved simultaneously with that of the basic test
would increase the complexity of the function that the basic test
was measuring. The hypothesis underlying the development
of revisions was based upon previous civilian (1) and military
(4, 7) research on the measurement of reaction to a stress
situation. In this study there were developed such second
problems, in addition to the basic task, as throttle manipulation,
counteraction of external control pressures, and target sighting;
and such changes in the stimulus situation of the basic test
as auditory rather than visual presentation, a moving rather
than a stationary target, simultaneous rather than discrete
control movement, and memory rather than perception of light
positions.

Development of modifications and revisions was considered
important for three reasons: (1) The modification-revision
method could effect economy of testing time and apparatus.
(2) Proper selection of the secondary and additional problems
might add significantly to the validity of methods of aircrew
selection. (3) Previous studies of concomitant behavior and
stress measurement had been shown efficient in measuring the
emotional components of problem solution, and such measurement
was particularly applicable to aircrew selection.

Construction of the Modifications and Revisions


Two standards were employed in the construction of the
modifications and revisions. These were concerned with the
behavioral function to be measured and the routine mechanics
of presenting the test problem to the subject. It was recognized
that a later statistical analysis would verify adherence
to the proper standards of construction.

The functions measured and the types of problem employed
in the construction of the modifications and revisions were 
selected from two sources. Where possible, the performance 
to be extracted from or added to the basic test was a function 
already shown to be valid per se for the prediction of aircrew 
Success. In selecting problems where no previous research was 
available, the face validity of the function as indicated in the 
Job analyses of aircrew performance was required. In the 
choice of function, an attempt was made to select those prob- 
lems which would be statistically independent of current pre- 
dictive instruments. 

In building the apparatus and presenting the test problem 
to the subject, care was taken to eliminate any variables which 
might affect the normality, objectivity, or consistency of the 
measurement, 


The Hand-Pressure Modification of the Complex Coordination
Test will serve as an illustration of the construction of a
modification. The Complex Coordination Test was described
in a recent article on AAF psychomotor tests (8). This
instrument had been designed to measure serial hand-foot
coordination in the operation of the stick and rudder bar of a
plane. It required the subject to match three movable lights
(controlled by a stick and rudder bar) with patterns of three
stationary red lights. The Hand-Pressure Modification
was preceded in civilian literature by the Luria studies
(3) and in aviation psychology literature by research in the
[illegible] Program (9) and at Psychological Research Unit No. 1
(7). The Hand-Pressure Modification was proposed in January,
[year illegible], and suggested three major improvements of
earlier methods: (1) The score was to be obtained by clock
recording of the length of time that grip pressure was maintained
above a given amount. (2) The hand to be measured was the one
operating the stick of the apparatus, thus affording a measure of
muscular tension in a voluntary act where the movement


measured was an integral part of the response being studied. 
(3) The subject was unaware that a secondary score was being 
obtained in that the stick of the modified Complex Coordina- 
tion Test was not visibly different from the stick of an unmodi- 
fied apparatus. 

The aircrew correlate to be predicted by the Hand-Pressure 
Modification was defined as the tension on the stick in 
operation of the plane. The modification was aimed primarily 
at the job of the pilot, but analyses of the duties of the navi- 
gator and bombardier as well had emphasized the need for 
absence of tension, confusion and nervousness, and fear and 
apprehension in aircrew performance. In view of the similarity 
of the basic Complex Coordination apparatus to the instru- 
ments used in the flying situation, the measurement of stick 
tension in this apparatus possessed high face validity for the 
prediction of similar stick tension in the piloting situation. . 

The Hand-Pressure Modification was constructed by modi-
fying the stick of the Complex Coordination Test. Contact
points were inserted inside the grip section of the stick.
When hand pressure on the stick reached a given amount, these
contact points closed and activated a standard electric timer.
As long as the hand pressure continued, the clock score was
recorded. When pressure on the stick was relaxed, electrical
contact was broken and the clock stopped. A high clock score
thus indicated an excessive amount of hand pressure
on the stick during the operation of the basic test; a low score,
a minimal amount of tension. The hand-pressure score was
collected during the eight-minute administration of the Com-
plex Coordination Test.
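
The clock-scoring principle described above may be illustrated by a short sketch. It assumes, purely for illustration, that grip pressure is sampled at a fixed interval; the original apparatus used contact points and an electric timer, not sampled readings, and the function name, threshold, and sample values below are hypothetical.

```python
def clock_score(pressure_samples, threshold, sample_interval_s):
    """Return the total seconds during which grip pressure was at or above
    the threshold, i.e. the time the (hypothetical) contacts were closed."""
    closed_samples = sum(1 for p in pressure_samples if p >= threshold)
    return closed_samples * sample_interval_s


# Hypothetical pressure readings taken every half second.
samples = [0.2, 0.9, 1.1, 1.3, 0.4, 1.0, 0.1]
score = clock_score(samples, threshold=1.0, sample_interval_s=0.5)
print(score)
```

A high value corresponds to sustained pressure on the stick, a low value to a relaxed grip.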

The Throttle-Control Revision of the Complex Coordina-
tion Test will serve as an illustration of the construction of a
revision. The Throttle-Control Revision was designed to mea-
sure the division of attention by the addition of a pursuit problem
requiring simultaneous solution with the Complex Coordination
Test problem. The aircrew function to be predicted by the
revision was defined as divided attention in the operation of
the controls of a plane. Actually, the division of attention
was a requirement for all aircrew positions, but appeared from
the job analyses and from other research studies to be a spe-
cific factor. The Throttle-Control Revision of the Complex
Coordination Test placed emphasis upon an aspect of this
ability required for pilot performance. Face validity was
achieved in this revision by duplicating the situation of divided
attention in piloting a plane, and by the use of throttle manipu-
lation as the simultaneous problem to be solved. Previous
research on pursuit problems had shown the validity of this
type of problem in itself (8). The combination of the pursuit
function with the proven coordination task held the possibility
of both cumulative validity and of sampling the division-of-
attention trait.
The revision apparatus was constructed on the principle of
the [...] pursuitmeter. It employed an ammeter, with the
face of an airspeed indicator superimposed on the dial. The
needle of the ammeter was mechanico-electrically moved across
the dial face in a random forward-backward pattern by an
automatic [...] unit. A tolerance area was marked on the dial
face, and excursion of the needle outside these limits resulted
in the disappearance of the stimulus lights of the Complex
Coordination Test. Forward-backward movements of a throttle
lever acted as a rheostat and resulted in the increase or
decrease of the airspeed and thus the control of the indicator
needle. The task of the subject was to divide his attention
between the matching of the red-green lights of the Complex
Coordination Test by the operation of the stick and rudder
controls, and the maintenance of the airspeed indicator within
the tolerance limits by the forward-backward movement of
the throttle lever. Failure to [...] the airspeed indicator
resulted in the disappearance of the stimulus lights.
Observation showed both tasks to be sufficiently simple to
allow their [...], and attention to both tasks to be the only
method of efficient performance. The score [...] the Complex
Coordination Test while simultaneously adjusting the airspeed
indicator. Performance was judged in four continuous trials of
two minutes each.




Analysis of the Modifications and Revisions 


The modifications and revisions were analyzed with refer- 
ence to a series of statistical requirements to verify their ad- 
herence to the criteria of test construction. In view of the fact 
that this study was conducted at a Medical and Psychological 
Examining Unit (4), emphasis was placed upon the so-called 
pre-validation standards of analysis. The three pre-validation 
criteria employed by the writer required adherence of the modi-
fication and revision measures to standards of distribution,
reliability, and independence. If further study were warranted,
AAF Training Command Headquarters (6) assigned the mea-
sure a validation priority at the Department of Psychology,
School of Aviation Medicine (8).

In terms of distribution, normality in the arrangement of
test scores was required. For consistency of measurement, an
odd-even reliability of at least .75, uncorrected for length, was
postulated. The criterion of independence was met if the modi-
fication or revision correlated below .60 with the basic test, and
below .40 with the test and stanine scores of the Aviation Psy-
chology Classification Battery (6).² The fourth analysis stand-
ard of validity was that employed throughout the Aviation
Psychology Program (6).
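
The reliability and independence standards just stated may be expressed as a simple check. The sketch below is illustrative only: the function name is hypothetical, the use of absolute values for the correlations is an assumption not stated in the text, and the normality requirement is not tested. The first call uses the figures reported for the Hand-Pressure Modification; the second uses invented values.

```python
def meets_prevalidation(reliability, r_basic, r_battery):
    """Check the reliability and independence standards described above.

    reliability : odd-even reliability, uncorrected for length
    r_basic     : correlation with the basic test
    r_battery   : correlations with battery tests and stanine scores
    Assumption: the size of a correlation is judged by its absolute value.
    """
    reliable = reliability >= 0.75
    independent = abs(r_basic) < 0.60 and all(abs(r) < 0.40 for r in r_battery)
    return reliable and independent


# Figures reported for the Hand-Pressure Modification.
print(meets_prevalidation(0.89, 0.10, [-0.03, 0.07, -0.08, 0.08]))

# Hypothetical measure whose reliability falls short of .75.
print(meets_prevalidation(0.70, 0.30, [0.20]))
```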

Analysis of the Hand-Pressure Modification scores of three
hundred aviation cadets showed this modification to afford a
normal distribution of scores, high reliability of measurement
(.89), and little relationship with the current AAF Complex
Coordination Test (.10), stanine scores (-.03 to .07), or Classi-
fication Test Battery (-.08 to .08). On the basis of the pre-
validation analysis, the Hand-Pressure Modification could make
a contribution to the multiple correlation of the AAF Battery
with pilot proficiency if it attained a minimum validity of
.13. In a validation study of this modification carried out at

² The Aviation Psychology Classification Battery consisted of approximately [...]
printed and 6 psychomotor tests, which measured verbal, mechanical, perceptual
speed, numerical, motor coordination, inductive reasoning, visualization, space rela-
tions, science education, and aviation interest abilities. The AAF tests were indi-
vidually weighted and then combined to afford a composite score for prediction of
pilot, navigator, and bombardier training success. This composite score was ex-
pressed on a nine-point scale in standard deviation units and hence was called a
stanine score.

the School of Aviation Medicine, biserial correlations of .19
(N = 209) and -.02 (N = 950) with graduation-elimination from
elementary pilot training were reported.
The analysis of Throttle-Control Revision data for three
hundred aviation cadets afforded a normally distributed curve
of test scores with a mean score of 18 as contrasted with the
average Complex Coordination score of 74. The odd-even relia-
bility constant was .74. Correlation between the Complex
Coordination Test and its Throttle-Control Revision [...].
Relationship of the revision with the AAF pilot
stanine was .24 as contrasted with the Complex Coordination
pilot stanine correlation of .69. On the basis of its correlation
with [...] Complex Coordination Test. No validation data on
the revision [...].


A total of twelve modifications and nine revisions of the
AAF Complex Coordination Test and Rudder Control Test
were developed. The results of construction and analysis are
summarized below, and Table 1 presents the pertinent statis-
tics obtained in the pre-validation analysis.
The pressure exerted in the operation of the stick and rudder
controls exhibited a high consistency of measurement and [...]
relation between such assessment of muscular tension and AAF
Battery measures. The pressure scores showed some communality
with each other, moderate relationship with other measures of
[...] activity, and a slight tendency toward
performance inhibition. Hand and leg pressure during the
operation of both the Complex Coordination Test and the
Rudder Control Test was apparently a psychological function
which was unsampled by AAF measures and which possessed a
good amount of face validity for aircrew prediction. The
Hand-Pressure Modification of the Complex Coordination Test


TABLE 1
Pre-Validation Statistics of the Modifications and Revisions (N = 300)

Column headings: Odd-even reliability; Correlation with basic test; Correlation with AAF battery; Correlation with AAF pilot stanine; Minimum validity required.
Rows: AAF Complex Coordination Test, with its modifications and revisions; AAF Rudder Control Test, with its modifications and revisions.



best satisfied the pre-validation criteria and was recommended, 
as a test case, for validation analysis. Validation data as
reported above were not conclusive. 

The movement of the limbs during the operation of the
Rudder Control pedal and stick showed an adequate reliability
and some relationship between the motility of the rudder and
stick and the AAF Battery measures. Modification scores were
moderately related for hand and foot movements, and measured
functions in common with the Rudder Control Test and the
secondary performances of hand pressure and precision of
target coordination. Pedal motility was considered sufficiently
non-duplicative of current predictive measures to afford an
independent contribution and thus to warrant a validation
study. A validity coefficient of .05 with graduation-elimination
from elementary pilot training was reported by the School of
Aviation Medicine.

Precision in pursuit coordination as measured by the ability
to maintain alignment of the follower and target was found to
be an aspect of basic test performance already inherent in the
basic test score. A study of errors in coordination was made
in the pursuit problems of the AAF Rudder Control Test,
[...] Pursuit Test, and the Two-Hand [...] Test.
It may be stated that the measurement of the [...]
that target-follower contact is broken is a [...]
of behavior already accounted for in the basic test [...].
One exception to this conclusion was found in the study [...]
contacts with the correct lights of the Complex Coordina-
tion Test. Accuracy of matching bore no relation to the
number of patterns completed nor to [...] measures of
arm-hand steadiness; and thus made the nature of the function
being measured difficult to define.

The length of time spent in the [...] Complex
Coordination Test problems showed two characteristics.
Measurement of the time spent on the total pattern [...]
the speed of reaction rather than the [...]
movement, and was found highly related to the basic
test score. Measurement of the time spent in [...]
upper [...] of lights showed some probability of battery contri-
bution.



The Throttle-Control Revision, as previously described,
required the subject to solve the Complex Coordination Test
simultaneously with a pursuit problem. The new problem
situation preserved the Complex Coordination function, but
added sufficient new abilities to require a minimum validity of
.21 for the revision to contribute to the predictive efficiency
of the current battery.

The Control-Pressure Revision required the subject to dis-
criminate between and to counteract external pressures in oper-
ation of the stick control of the Complex Coordination Test.
Construction inadequacies resulted in the failure of the revision
to meet the pre-validation standards. In view of the [...]
of the similar Machine-Displacement Revision of the AAF Rudder
Control Test, further preliminary study of this revision was
recommended.

The Red-Light-Memory Revision required the subject to
recall the light positions in order to match the patterns of the
Complex Coordination Test. The correlation of this revision
with the basic test, stanines, and AAF battery [...] indi-
cated its enlargement of the functions measured by the Com-
plex Coordination Test, but in the direction of already existing
predictors.

The Auditory-Instructions Revision required the subject
to match the Complex Coordination stimulus lights with the
number positions presented orally. This [...]
ability was found moderately related to basic test and stanine
measures, but still capable of separate contribution to the effec-
tiveness of the battery. When confusion sounds were [...]
as a background to the auditory instructions, the revision was
found unrelated to AAF aptitude measures, and capable of
significant contribution.

The Simultaneous-Control-Movement Revision required
the subject to operate the Complex Coordination controls
simultaneously rather than serially. It showed low correlation
with current predictive measures and was apparently worthy
of further investigation.

The Moving-Target Revision of the Rudder Control Test
required the subject to coordinate rudder-bar movements to
follow a target across a horizontal path. Study of this increase
in difficulty showed that it retained a good measure of Rudder
Control performance, but that it also added sufficient new skills
to lower significantly the correlation between basic test and
stanine for efficient contribution.

The Machine-Displacement Revision required the subject
to discriminate between and to counteract pressures externally
[...] to displace the Rudder Control apparatus. The
Rudder Control function again remained in sufficient amount
[...] retention of its validity, and at the same time
[...] decreased so as to allow battery contribu-
tion. The revision showed a minimum validity of .15.
The Target-Sighting Revision required the subject to indi-
cate apparatus-target alignment by the depression of a gun-
control button mounted on top of the Rudder Control stick.
Accuracy in such visual perception was moderately
related to a number of AAF Battery measures, but the revision
still appeared capable of making a contribution to the multiple
correlation of the battery.
The Bank-Control Revision measured a function similar to
that sampled in the Throttle-Control Revision of the Complex
Coordination Test, and exhibited comparable analytical data.
Summary

In terms of the study of the modification-revision method,
it may be concluded that:

Instruments of proven predictability may be modified
for extraction of secondary performance scores and revised by
adding second problems for the enlargement of basic test
function.

Such concomitant behaviors and enlarged functions are not
universally being measured by the basic test score.

Depending upon the nature of the specific modification or
revision, such measures may make a contribution to the pre-
dictive efficiency of a battery of tests.

The value of a modification or revision, which satisfies the
pre-validation and validation criteria, would be such a con-
tribution with little additional expense in testing time and
apparatus.



REFERENCES

1. Freeman, G. L. “Suggestions for a Standardized ‘Stress Test.’”
   Journal of General Psychology, XXXII (1945).
2. Guilford, J. P. Psychometric Methods. New York: McGraw-
   Hill Book Company, 1936.
3. Luria, A. R. The Nature of Human Conflicts. New York:
   Liveright Publishing Corporation, 1932.
4. Staff, Psychological Branch, Office of the Air Surgeon, Head-
   quarters Army Air Forces. “The Aviation Psychology Program
   of the Army Air Forces.” Psychological Bulletin, XL
   (1943), 759-769.
5. Staff, Psychological Branch, Office of the Air Surgeon, Head-
   quarters Army Air Forces. “Present Organization, Policies,
   and Research Activities of the AAF Aviation Psychology
   Program.” Psychological Bulletin, XLII (1945).
6. Staff, Psychological Section, Office of the Surgeon, Head-
   quarters Army Air Forces Training Command. “Psychologi-
   cal Activities in the Training Command, Army Air Forces.”
   Psychological Bulletin, XLII (1945), 37-54.
7. Staff, Psychological Research Unit No. 1. “History, Organization,
   and Procedures, Psychological Research Unit No. 1, Army Air
   Forces.” Psychological Bulletin, XLI (1944), 105-114.
8. Staff, Psychological Research Unit No. 2 and Department of Psy-
   chology, School of Aviation Medicine. “Research Program
   on Psychomotor Tests in the Army Air Forces.” Psychologi-
   cal Bulletin, XLI (1944), 307-321.
9. Viteles, M. S. “The Aircraft Pilot: Five Years of Research. A
   Summary of Outcomes.” Psychological Bulletin, XLII
   (1945), 489-526.


THE EFFECT OF BIAS DUE TO DIFFICULTY FACTORS
IN PRODUCT-MOMENT ITEM INTERCORRE-
LATIONS ON THE ACCURACY OF ESTI- 
MATION OF RELIABILITY BY 
THE KUDER-RICHARDSON 
FORMULA NUMBER 20 


HUBERT E. BROGDEN 
War Department 

IN deriving the formulae of the Kuder-Richardson (3) series,
the assumption is made that the item intercorrelations can be
accounted for by a single factor. Wherry and Gaylord (6)
criticize the Kuder-Richardson formulae for this reason, and
[...] that when this assumption is not met in practice,
bias may result in the estimates of reliability provided
by the formulae of the Kuder-Richardson series with the excep-
tion of formula No. 2 which Wherry and Gaylord accept as
fundamental, and which does not involve the assumption of a
single factor. The criticism of Wherry and Gaylord was di-
rected at possible bias due to content factors. Ferguson (2)
and Wherry and Gaylord (5) have stressed the fact that the
magnitude of product-moment correlations between two-
category items is a function of the difficulty values for the
items correlated. The variation in the magnitude of the
correlation with the variation in difficulty can be quite appre-
ciable. For example, items having tetrachoric intercorrela-
tions of .80 have product-moment correlations varying from .59
with both points of cut at the 50th percentile, to [...] with one
point of cut at the 16th and the other at the 84th percentile.
Since the intercorrelations of the items which are assumed to be
due to a single factor in deriving the Kuder-Richardson formulae
are product-moment, it is apparent that the assumption that they
can be accounted for by a single factor cannot be met unless all



TABLE 1
Item Difficulty Distributions for Tests Labelled Normal in Table 2

                           Percentage correct
Total
number     03    07    16    31    50    69    84    93    97
of
items      Baseline values corresponding to given percentage correct
          ...  -1.5  -1.0   -.5     0    .5   1.0   1.5   ...

  18        1     1     2     3     4     3     2     1     1
  45        2     3     5     8     9     8     5     3     2
  90        4     6    10    16    18    16    10     6     4
 153        6    10   ...


items are of equal difficulty. However, the assumption of fac-
torial homogeneity is involved in the K-R 20 formula only as
a means of estimating the diagonal entries of the matrix which
is the numerator of the basic formula for the reliability coeffi-
cient. Hence it is not immediately evident whether the error
introduced by the failure to satisfy the assumption of a single
factor will be appreciable either generally or for tests of a par-
ticular size and reliability. The present paper is concerned with
evaluating the extent of this error in K-R 20 coefficients.
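
The dependence of the product-moment correlation on the points of cut may be illustrated numerically. The sketch below is not the author's procedure, which referred to tetrachoric correlation tables; it instead integrates the bivariate normal surface by Simpson's rule. The function names are hypothetical, and the tetrachoric value of .80 is chosen only because it reproduces the value of .59 quoted for cuts at the 50th percentile.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def joint_below(h, k, r, steps=2000):
    """P(X < h and Y < k) for a standard bivariate normal with correlation r,
    obtained by Simpson's rule over x from -8 to h (steps must be even)."""
    lower = -8.0
    width = (h - lower) / steps
    scale = math.sqrt(1.0 - r * r)

    def integrand(x):
        return norm_pdf(x) * norm_cdf((k - r * x) / scale)

    total = integrand(lower) + integrand(h)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lower + i * width)
    return total * width / 3.0


def phi_from_tetrachoric(r, h, k):
    """Product-moment (phi) correlation of two dichotomized items whose
    underlying normal variables correlate r and are cut at h and k."""
    p1, p2 = norm_cdf(h), norm_cdf(k)
    p12 = joint_below(h, k, r)
    return (p12 - p1 * p2) / math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))


# Both cuts at the 50th percentile (baseline value 0).
print(round(phi_from_tetrachoric(0.80, 0.0, 0.0), 2))
# One cut at the 16th and the other at the 84th percentile (baseline -1 and +1).
print(round(phi_from_tetrachoric(0.80, -1.0, 1.0), 2))
```

With both cuts at the median the result agrees with the closed form (2/π) arcsin r; with widely separated cuts the product-moment value is much lower, which is the bias under discussion.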


TABLE 2
K-R 20 and K-R 2 Reliabilities for Tests Having Designated Item Intercorrelations,
Item Difficulty Distributions, and Numbers of Items

Columns: K-R 20 and K-R 2 for each assumed tetrachoric item intercorrelation.
Rows: normal, rectilinear, and skew item difficulty distributions, by number of items (n).



In a recent article (1) the author determined the effect of
variation in item difficulty distributions on the validity and
reliability of test scores. For the purpose of that article
K-R 2 coefficients rather than K-R 20 were calculated. The
computations were such that K-R 20's could also be readily
determined. All of these computations involved the assumption
that the tetrachoric intercorrelations of the items were equal.
Product-moment intercorrelations and standard deviations were
computed for two-category items of specified difficulty values by
referring to tetrachoric correlation tables. These coefficients
were then inserted in the reliability formula.¹

The tests, or groups of items, for which the reliability coeffi-
cients [...] were computed, vary in length from 9 to [...]
items, and the assumed tetrachoric intercorrelations vary
from .2 to .8. Three types of distributions of item difficulties
were examined, the first being rectilinear in terms of baseline or
[...] difficulty units; the second being normal in terms of these
units; and the third being skewed.

Since the normal distributions could not be exactly approxi-
mated in all instances, the distributions are listed in Table 1.
In Table 2 the K-R 2 and the K-R 20 reliabilities are presented.

It is apparent, in general, that the K-R 20 is not seriously
influenced by bias in the product-moment intercorrelations.
Even in those exceptional cases where bias is apparent the
variation in item difficulty and the degree of assumed
tetrachoric intercorrelation is much greater than would usually
occur in actual practice.
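
For readers who wish to compute the coefficient under discussion, the following sketch applies Kuder-Richardson formula 20 in its usual form, r = [n/(n - 1)] [1 - Σpq/σ²], to a matrix of item scores. The function name and the small data set are hypothetical, and the variance of total scores is taken with N (not N - 1) in the denominator, which is an assumption of this sketch.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for a list of examinee rows,
    each row being a list of item scores of 0 or 1."""
    n_people = len(responses)
    n_items = len(responses[0])

    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_people
    # Variance of total scores, dividing by N.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people

    # Sum over items of p * q, the item variances.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        sum_pq += p * (1.0 - p)

    return (n_items / (n_items - 1)) * (1.0 - sum_pq / var_total)


# Hypothetical scores of five examinees on four items of unequal difficulty.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 2))
```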


¹ See (1) for a detailed discussion of method.


REFERENCES

1. Brogden, Hubert E. “Variation in Test Validity with Variation
   in the Distribution of Item Difficulties, Number of Items,
   and Degree of Their Intercorrelation.” Psychometrika,
   XI (1946), in press.
2. Ferguson, G. A. “The Factorial Interpretation of Test Diffi-
   culty.” Psychometrika, VI (1941), 323-329.
3. Kuder, G. F. and Richardson, M. W. “The Theory of the Esti-
   mation of Test Reliability.” Psychometrika, II (1937), 151-160.
4. Tucker, L. R. “Maximum Validity of a Test with Equiva-
   lent Items.” Psychometrika, XI (1946), 1-13.
5. Wherry, Robert J. and Gaylord, Richard H. “Factor Pattern of
   Test Items and Tests as a Function of the Correlation
   Coefficient; Content, Difficulty and Constant Error Fac-
   tors.” Psychometrika, IX (1944), 237-244.
6. Wherry, Robert J. and Gaylord, Richard H. “The Concept of
   Test and Item Reliability in Relation to Factor Pattern.”
   Psychometrika, VIII (1943), 247-264.




SOME SUGGESTIONS FOR THE IMPROVEMENT OF
MACHINE-SCORING METHODS¹


E. K. TAYLOR 
Adjutant General's Office 

THE extensive registration of students at schools of all kinds
as well as the large number of civil service examinations which
will be given in the next several years will undoubtedly greatly
increase the use of machine-scorable examinations. Ready
acceptance by a large portion of the examinees of the separate
answer sheet will result from the fact that nearly all members
of the armed forces are exposed at least once in the course of
their military career to an objective machine-scorable classifi-
cation or placement test of one sort or another.

There can be no doubt that where any considerable number
of objective examinations are to be scored or item-analyzed,
both increased accuracy and intensive saving in time will result
from the use of scoring machines. This is not meant to imply
that the scoring machine is beyond improvement but rather
that worthwhile savings may be realized by the use of certain
techniques and checking procedures. The purpose of this paper
is to present several such procedures which the writer has found
useful in machine scoring and item analysis.


Test Administration 


In testing school populations, particularly in colleges, few
problems in test administration arise as a result of the use of
separate answer sheets. Most of the examinees are well enough
acquainted with the procedure to appreciate the need for using
a special pencil and to refrain from marking more than one
space per item. Unless, however, the test booklet and answer


¹ The opinions expressed are those of the writer and are not to be construed as
reflecting the official attitude of the War Department.



sheet are especially designed, as for example in the case of the 
Kuder Preference Record, a need arises for the examinee to 
alternate his attention between the test booklet and the answer 
sheet without losing his place on either. The misplacement of
a response becomes a serious problem particularly when the test
is administered under a rigid time limit. As a result many
examinees employ the simple expedient of resting their pencils
on the answer sheet while reading the question in the booklet.
Although most test directions instruct the examinees to rest the
pencil on the item number rather than on the first response posi-
tion, such instructions are frequently ignored. Too often this
results in small extraneous marks in the sensing area which are
difficult to detect but adequate to conduct the current. To
reduce the occurrence of such errors to a minimum it is advisa-
ble to supply each examinee with a blank sheet of 8½ by 11 inch
paper to be used both for scratch and as a means of marking
his place. In using a guide sheet it is advisable to print one
side with examination instructions so that only one side will be
used for notes, thus precluding the possibility of transferring
graphite from the guide paper to the answer sheet.

It should be remembered that the use of a separate answer
sheet in itself constitutes a simple coding test. Hence its use
in testing populations of low intelligence levels is questionable.
It is the opinion of the writer that the use of separate answer
sheets for the examinations of candidates for such positions
as those of hospital attendants, prison guards, etc., is
inadvisable.

Scoring


The ease and accuracy of machine-scoring objective exami-
nations varies with the level and experience of the group tested.
College populations accustomed to the manipulation of separate
answer sheets present few scanning problems. Adult popula-
tions, especially those of the levels described above, are frequent
sources of scanning difficulty and present a far greater propor-
tion of answer sheets that must be hand scored than do college
populations.

To reduce scoring time to a minimum without sacrificing
accuracy, the writer has found the following procedures useful:



1. Test papers are superficially scanned for obvious double
marks and a red line is drawn through any omissions. Such
omissions are counted and the total subtracted from the num-
ber of items in the test. The number of attempts is recorded
in some designated space on the answer sheet. During this
scanning, papers written in ink or made otherwise obviously
unsuited for machine scoring are set aside. No attempt is made
in this scanning to find anything but very obvious flaws.

2. Papers are then sent to the scoring machine which is set
up for final scoring. Papers are fed to the machine on which
the appropriate scoring switch is set to read R + W. This read-
ing is compared with the number recorded by the scanner. If
both agree, the paper is scored. If the dial reading fails to
agree (within a previously established margin of error) with the
number of attempts recorded in scanning, the machine reading
is recorded and the paper laid aside for more thorough scanning.

3. If the number of disagreements is small and the scoring
formula is Rights, it is a simpler matter to hand score than to
re-scan. Where correction for chance is made or where a large
number of discrepancies occur, it is generally advisable to re-
scan and to repeat the scoring process. After re-scanning, if
the discrepancy does not disappear, hand scoring is indicated.
4. Where correction for chance or other scoring formulae
involving the scoring of wrong responses is employed and not
all response positions in any active scoring field are used, an
elimination key should be used. This is particularly true when
a four- or five-response-position answer sheet is used for a true-
false test or in any case in which there are more response posi-
tions on the answer sheet than there are alternatives in the test
items. When the number of items in the test is not a multiple
of 15, not all of the response positions in the last active scoring
field on the answer sheet will be used. Here, too, the use of an
elimination key is desirable if formula scoring is employed.

5. Where the same template is to be used frequently and
particularly if it is an elimination key with large areas removed,
it has been found more desirable to use keys made of thin sheets
of plastic than to use regular paper stencils.

6. Considerable difficulty has been encountered by the
writer in the use of the paper chute for stacking the answer
papers being scored. This is particularly true when the
papers are not in excellent condition. While it is
possible to remove each paper from the scoring slot by hand,
this method is both clumsy and time-consuming. It is prefera-
ble to leave the door to the storage space open and to allow the
papers to fall into a box placed on the floor next to the ma-
chine. The [...] cardboard box used for shipping answer
sheets has been found quite satisfactory for this purpose.

7. If it is required that answer sheets be returned in any
particular order (e.g., alphabetical order by the examinee's
name), it is [...] to make the necessary arrange-
ments after scoring rather than before. If for some reason
papers arrive for scoring in the exact order in which they are
to be returned, it is less time-consuming to number
them consecutively and to rearrange them after scoring than
to attempt to maintain the required order throughout the
scanning and scoring [...].

8. [...] respond to all of the items, the [...] between
rights scoring and correction-for-chance
scoring is unity. Unless the number of omissions is excessive
and the number of alternatives small (as in true-false tests) the
difference between the two methods of scoring does not justify
the time required and the opportunity for inaccuracy intro-
duced by formula scoring. Since “guesses” are [...]
based on subliminal cues and “hunches” rather than on pure
chance, it is questionable if formula scoring should be used [...].
It is the opinion of the writer that examinees should
be instructed to make some response to each item in [...]
tests should be scored for rights. The only
circumstances in which the writer believes that the use of [...]
scoring is justified are those in which a multiple-choice
item analysis indicates that certain of the responses [...] signifi-
cantly [...] with some valid criterion.
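
Two of the procedures above lend themselves to a brief numerical sketch. The first function compares the scanner's count of attempts with the machine's R + W reading, as in step 2; the second applies the customary correction-for-chance formula, R - W/(k - 1), which the text does not state explicitly and which is assumed here. Function names, the margin, and all figures are hypothetical.

```python
def needs_rescan(attempts_counted, machine_r_plus_w, margin=0):
    """Flag a paper when the machine's R + W reading disagrees with the
    scanner's count of attempts by more than the allowed margin."""
    return abs(attempts_counted - machine_r_plus_w) > margin


def corrected_score(rights, wrongs, alternatives):
    """Customary correction for chance: R - W / (k - 1)."""
    return rights - wrongs / (alternatives - 1)


# A paper with 48 attempts recorded but 50 marks sensed, margin of 1.
print(needs_rescan(48, 50, margin=1))

# With no omissions, W = n - R, so the corrected score is a linear
# function of the rights score and the two scores rank papers identically.
n_items, k = 60, 5
for rights in (20, 35, 50):
    print(rights, corrected_score(rights, n_items - rights, k))
```

Because the corrected score is then R·k/(k - 1) minus a constant, the extra machine work changes the scale but not the ordering of examinees.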




E Weighting 

ile t i justi i 

e here appears to be little justification for the 
ms in most objective tests where a large num 


weight 
ber of 


eC no: 


MACHINE-SCORING METHODS 525 


items are used, this procedure may prove useful on short tests 
or when two separate tests are to be assigned regression weights 
in a multiple-correlation prediction and are taken on a single 
answer sheet. The methods given in the IBM Manual (2) 
generally require several runs through the machine. Those 
developed by Grossman (1) not only require a special answer 
sheet but materially reduce the number of items to which re- 
sponses may be made on an answer sheet. Below are presented 
two methods of weighted scoring which, while they do not pro- 
vide the scope of Grossman’s solution, will be adequate for 
simple weighting problems. Regular IBM answer sheets are 
used in both cases, and no change from ordinary administrative 
Procedures are entailed. No reduction in the number of items 
Per surface of the answer sheet is involved. Either method is 
adaptable to multiple weighting by running the papers once 
for each two weights employed. This reduces by 50 per cent 
the amount of machine time required in the scoring of almost 
all weighted tests. Both methods are essentially adaptations 
of Rulon’s (4) technique for simplifying split-half reliability 
determinations.

1. Subtraction Method. This method is applicable in cases
where the responses are weighted unity and some small number
not in excess of four. Incorrect responses are considered as
having zero weight.

If the responses having a weight of zero are subtracted from
the total number of responses made, each of the remaining
responses is automatically given a weight of unity. The problem
then resolves itself into one of separating the weighted
items into two groups: those having a weight of unity, and those
having some other weight, for example, four. The subtraction
has had the effect of assigning one point to each of the remaining
responses, and has thus scored those items having unit weight.
Unit weight has also been assigned to those responses to which a
weight of four is to be given. All that remains then is to add
three points for each item to be weighted four.

Punching of the templates to accomplish this purpose is
done as follows:


a) Responses having the weight zero are treated as “wrong”
responses; i.e., no punch is made in either the “rights” or the
“elimination” key for these responses.

b) Responses having the value one are eliminated from
scoring; i.e., these response positions are punched in the “elimi-
nation” template but not in the “rights” template.

c) “Weighted” responses are punched as “rights”; i.e., these
response positions are punched in both keys.

d) Both the “right” and “wrong” field position holes are
punched.

The field selection dial is set to the proper field, and its cor-
responding reading knob to the “R – W” position. The “wrongs”
rheostat is set to unity and the “rights” rheostat is set to N – 1,
where N is the numerical value of the weight. The reading
secured from the “R – W” position is thus (N – 1)R – W. This
score added to the total number of items attempted yields the
desired weighted score.
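The arithmetic behind the subtraction method can be checked with a short sketch. The function and argument names below are illustrative assumptions, not part of the original procedure; the code only confirms that the machine reading (N – 1)R – W, added to the number of attempts, equals the directly weighted score.

```python
def subtraction_method_score(n_weighted, n_unit, n_zero, weight):
    """Weighted score as obtained by the subtraction method.

    n_weighted: responses carrying weight N (punched as "rights")
    n_unit:     responses carrying weight 1 (eliminated from machine scoring)
    n_zero:     responses carrying weight 0 (treated as "wrongs")
    weight:     the numerical value N of the larger weight
    """
    # Number of attempts, recorded in the preliminary scanning of the paper.
    attempts = n_weighted + n_unit + n_zero
    # "R - W" reading with the rights rheostat at N - 1 and the wrongs at unity.
    machine_reading = (weight - 1) * n_weighted - n_zero
    return machine_reading + attempts


def direct_weighted_score(n_weighted, n_unit, n_zero, weight):
    """Score obtained by applying the weights N, 1 and 0 directly."""
    return weight * n_weighted + 1 * n_unit + 0 * n_zero


if __name__ == "__main__":
    # Hypothetical paper: 10 responses weighted four, 25 weighted one, 5 weighted zero.
    print(subtraction_method_score(10, 25, 5, 4))
    print(direct_weighted_score(10, 25, 5, 4))
```

The zero-weight responses cancel because they are subtracted once by the machine and added once in the count of attempts.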

2. Addition Method. An alternative method of achieving
the same result may be accomplished by the following template
punching:

a) Those responses having the weight zero are eliminated;
i.e., the appropriate response positions are punched in the
“elimination” template but not in the “rights” template.

b) Those items having one weight (between 0 and 3) are
treated as “wrongs”; i.e., these response positions are not
punched in either stencil.

c) The items having the second weight are punched as
“rights”; i.e., these response positions are punched in both keys.

d) The “wrongs” position is punched for the concerned
field “A” holes.

e) The “rights” position is punched in the concerned field
“B” holes.

The rheostat and knob settings are as follows:

a) With the field selection knob set at “A” and the “A”
field knob set at “W,” the field “A” rheostat is set for “M” weight-
ing, where “M” is one weighting factor.

b) With the field selection knob set at “B” and the “B”
field knob set at “R,” the field “B” rheostats are adjusted to
read “N,” where “N” is the other weighting factor.


Two readings are required for each paper. These may be 
made successively during a single run of the papers. The “A” 
and “B” field knobs are set at “W” and “R” respectively. These
settings are retained throughout the scoring. The field selec- 
tion switch is the only one manipulated in running the papers. 
For each paper scored, a reading must be taken in the “B” as 
well as in the “A” field. The sum of these readings yields the 
desired score. 
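The addition method reduces to summing two weighted counts, which the following sketch illustrates. The names are hypothetical, and the rescaling helper is only one reading of the remark that fractional weights together with multiplication by a constant can produce any desired pair of weights.

```python
def addition_method_score(n_first, n_second, m, n):
    """Sum of the field "A" and field "B" readings.

    n_first:  responses carrying the first weight M (read as "wrongs" in field A)
    n_second: responses carrying the second weight N (read as "rights" in field B)
    Zero-weight responses are eliminated and contribute nothing.
    """
    field_a = m * n_first    # "W" reading, field A rheostat set for M
    field_b = n * n_second   # "R" reading, field B rheostats set for N
    return field_a + field_b


def rescaled_score(n_first, n_second, m, n, constant):
    """Score for weights (constant * m, constant * n), obtained by
    multiplying the final machine score by a constant."""
    return constant * addition_method_score(n_first, n_second, m, n)


if __name__ == "__main__":
    # Hypothetical paper: 12 responses of weight 2 and 7 responses of weight 3.
    print(addition_method_score(12, 7, 2, 3))
    # Desired weights 6 and 9 from settings 2 and 3 and a multiplier of 3.
    print(rescaled_score(12, 7, 2, 3, 3))
```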

The chief differences between the two methods are the facts
that (1) the subtraction method requires a preliminary scan-
ning of the papers and the recording of the number of attempts
made on each paper and (2) the addition method, on the other
hand, requires the recording of two scores on the machine. The
subtraction method is restricted to three weights, two of which
are unity and zero. The addition method requires only that one
of the three weights be zero; the other two may be established
as required by the situation. Fractional weights and multipli-
cation of final score by a constant yield any desired pair of
weights. Both methods require a simple addition after machine
scoring. The subtraction method, in using only one field, has
the further advantage that in experimentation three different
values of N, all applying to the same group of items, may be
employed at the same time.


Graphic Item Counting

The comparisons recently reported by McNamara and
Weitzman (3) clearly demonstrate the saving to be realized
through the use of the Graphic Item Counter. The small additional
charge made for this device should insure its inclusion in every
scoring machine to be used where item analysis of any sort is
to be a part of the procedure. It has been the experience of the
writer that the trained operator can record ninety responses
from [100?] papers in from 7 to 10 minutes. The reading of the
graphs and the wiring of the boards are not considered in these
figures. Several short-cuts in feeding papers, wiring boards
and reading charts have been developed and are reported below.


528 EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT 


Paper Feeding 


Since the speed of the recorder is constant, the only saving
in time that can be realized in this part of the procedure is in
the feeding process. As the operator has nothing to do while
the responses are being recorded, this time can be utilized in
pre-positioning the next paper to be inserted. To accomplish
insertion with the least effort, especially when the papers are
not in perfect condition, the paper to be inserted into the
machine is slipped under the protruding edge of the paper
being read. When the first paper is released, the second will
slip easily into the reading slot.

After the reading unit has passed about half way on its
course, depression of the feed lever will not open the scoring slot
until the reading has been completed. Thus, while the operator
pre-positions the answer sheet with his right hand, he should
depress the feed lever with his left, thus releasing the first
paper as soon as possible. Releasing the lever, dropping the
pre-positioned paper and depressing the feed lever again com-
pletes the cycle for a single paper. Some practice is necessary
for the operator to develop the synchronization necessary to
successful accomplishment of this procedure.

Wiring

Three types of analyses are generally accomplished by the use
of the Item Counter: attempts, rights and alternates. The
maximum number of recordings in a single run is 90. When
attempts or rights are counted, two runs and two wirings are
required to analyze the 150 items provided for on each side of
the answer sheet. When both sides of the answer sheet are to
be analyzed, no additional wiring is required. A universal board
which will serve for all “rights” and “attempts” counts is de-
scribed in the IBM Manual.

A commoning stencil is used for all runs. Templates are used
to supplement the commoning key. For the analysis of the first
90 items, a rights template for [words illegible] is placed between
the commoning key and the [words illegible]. To analyze the
remaining 60 items, it is necessary to replace the first template
with one punched for the last 60 items only.


Attempts analysis may be similarly accomplished. A scor-
ing template is cut in half in the space separating the first six
scoring fields from the last four. The latter part of the stencil 
is used in analyzing the first 90 items and the former in ana- 
lyzing the last 60. 

The same board may of course be used for “alternative” 
analysis. Ten templates are required for the complete analysis 
as outlined in Table 1. 


TABLE 1

Description of Templates to be Employed in the Use of the Universal
Plugboard for Alternative Analysis of 150 Items on the
Graphic Item Counter

Template     Response     Items
number       position     number
    1            A          1-90
    2            B          1-90
    3            C          1-90
    4            D          1-90
    5            E          1-90
    6            A         91-150
    7            B         91-150
    8            C         91-150
    9            D         91-150
   10            E         91-150


The above method yields a vertical analysis; i.e., each run
yields the count on the same response position for a number of
items. While this is generally acceptable, in certain situations
it is desirable to have the count of the several items recorded
in adjacent positions on the item count sheet. A universal
board for this type of analysis is also possible. Thirty nine-
prong and sixty eight-prong multiple wires are required. These
are plugged as demonstrated in Tables 2 and 3. Nine templates
are required for the complete analysis. In the first template,
all response positions for the first 18 items are punched out.
Response positions for items 19 to 36 are punched out of the
second stencil. In the third stencil, items 37 to 54 are punched
out and so on. In the fourth stencil items 55 to 72 are punched
out; items 73 to 90 are punched out of the fifth stencil, etc.
Eighteen five-response items are analyzed per run. Similar
wiring for four-response items can be accomplished with 72
seven- and 16 six-

TABLE 2

Number of Item to which Each Prong of Multiple Wires Is Plugged in
5-Response Universal Item Analysis Board

Prong
number     1     2     3     4     5     6     7     8     9

           Items to which prongs are plugged

           1    19    37    55    73    91   109   127   145
           2    20    38    56    74    92   110   128   146
           3    21    39    57    75    93   111   129   147
           4    22    40    58    76    94   112   130   148
           5    23    41    59    77    95   113   131   149
           6    24    42    60    78    96   114   132   150
           7    25    43    61    79    97   115   133
           8    26    44    62    80    98   116   134
           9    27    45    63    81    99   117   135
          10    28    46    64    82   100   118   136
          11    29    47    65    83   101   119   137
          12    30    48    66    84   102   120   138
          13    31    49    67    85   103   121   139
          14    32    50    68    86   104   122   140
          15    33    51    69    87   105   123   141
          16    34    52    70    88   106   124   142
          17    35    53    71    89   107   125   143
          18    36    54    72    90   108   126   144
TABLE 3

Response Positions to which All Prongs of Numbered Wires Are
Plugged for Items Shown in Table 2

           Response position

     A     B     C     D     E

     1     2     3     4     5
     6     7     8     9    10
    11    12    13    14    15
    16    17    18    19    20
    21    22    23    24    25
    26    27    28    29    30
    31    32    33    34    35
    36    37    38    39    40
    41    42    43    44    45
    46    47    48    49    50
    51    52    53    54    55
    56    57    58    59    60
    61    62    63    64    65
    66    67    68    69    70
    71    72    73    74    75
    76    77    78    79    80
    81    82    83    84    85
    86    87    88    89    90


prong multiple plug wires. Seven templates are required and 
22 items analyzed per run. 
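The wire and template counts quoted for the universal horizontal-analysis board can be reproduced with a short sketch. It assumes, as Tables 2 and 3 suggest, that wire number 5(i – 1) + j serves response position j of the i-th item in each run, and that each wire needs one prong per template in which that item exists. The function name and data layout are illustrative only.

```python
from collections import Counter


def wiring_plan(n_items=150, n_responses=5, counters=90):
    """Return (items per run, number of templates, wires).

    wires maps a wire number to a list of (item number, response letter),
    one entry per prong.
    """
    items_per_run = counters // n_responses
    n_templates = -(-n_items // items_per_run)  # ceiling division
    wires = {}
    for row in range(items_per_run):
        for pos in range(n_responses):
            wire = row * n_responses + pos + 1
            prongs = []
            for template in range(n_templates):
                item = template * items_per_run + row + 1
                if item <= n_items:  # the last template may be only partly filled
                    prongs.append((item, "ABCDE"[pos]))
            wires[wire] = prongs
    return items_per_run, n_templates, wires


if __name__ == "__main__":
    for k in (5, 4):
        per_run, templates, wires = wiring_plan(n_responses=k)
        prong_counts = Counter(len(p) for p in wires.values())
        print(k, per_run, templates, dict(prong_counts))
```

Under these assumptions the five-response board needs thirty nine-prong and sixty eight-prong wires over nine templates, and the four-response board needs 72 seven-prong and 16 six-prong wires over seven templates, in agreement with the text.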

To facilitate the reading of the graphic item count record
sheets, the writer suggests that they be overprinted with thin
vertical lines marking off each item. This, it has been found,
speeds up reading and naturally reduces errors of recording.

Even when no re-wiring is required, as when using any of
the above procedures, it is advisable that the board be checked
each time the plug-board templates are changed. This is to
assure the proper placement of these templates. The method
of checking advised requires as many check sheets as there are
alternates to each item. Sheet 1 should bear marks in position
1 of the items to be analyzed in that run. Sheet 2 should bear
marks on response positions 2, etc. In testing, sheet 1 is run
through the machine once; sheet 2 twice, etc. The result of
this run will yield a series of right triangles, as illustrated in
Figure I. Departures from this pattern become immediately
apparent and indicate the source of error. For the application
of this checking method to rights analysis, the Manual should
be referred to.

FIGURE I. Appearance of a Portion of the Check Sheet on a Properly Wired Board
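The reason the check sheets produce right triangles can be seen in a small simulation. This is a sketch under the assumption that the counts for the alternates of an item are recorded side by side, so that counts of 1, 2, 3, ... form a rising staircase; the function names are hypothetical.

```python
def check_sheet_counts(n_items, n_alternates=5):
    """Counts per item and response position after the check run.

    Sheet k bears marks in response position k of every item and is
    run through the machine k times.
    """
    counts = [[0] * n_alternates for _ in range(n_items)]
    for sheet in range(1, n_alternates + 1):
        for _ in range(sheet):
            for item in range(n_items):
                counts[item][sheet - 1] += 1
    return counts


def looks_properly_wired(counts):
    """True when every item shows the staircase 1, 2, ..., k."""
    expected = list(range(1, len(counts[0]) + 1))
    return all(row == expected for row in counts)


if __name__ == "__main__":
    good = check_sheet_counts(3)
    print(good, looks_properly_wired(good))
    # Simulated fault: two wires of the second item interchanged.
    bad = [row[:] for row in good]
    bad[1][0], bad[1][3] = bad[1][3], bad[1][0]
    print(looks_properly_wired(bad))
```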


Sampling

The greatest time-saving device, employable in either hand
or machine analysis, is the sampling of the population so as to
yield an N that is a multiple of 100. Most frequently the loss
in the number of cases used will be more than compensated for
by the saving realized in computational time.


REFERENCES 


1. Grossman, Sgt. David. “Technique for Weighting of Choices
and Items on IBM Scoring Machine.” Psychometrika, IX
(1944), 101-105.

2. Manual of Instructions for the IBM Test Scoring Machine. Endi-
cott, N. Y.: International Business Machines Corporation,
1943.

3. McNamara, Lt. W. J. and Weitzman, Lt. E. “The Economy of
Item Analysis with the IBM Graphic Item Counter.” Jour-
nal of Applied Psychology, XXX (1946), 84-90.

4. Rulon, P. J. “A Simplified Procedure for Determining the Relia-
bility of a Test by Split Halves.” Cambridge, Mass.: Har-
vard Educational Review, IX (1939), 99-103.


A SHORT-CUT METHOD FOR σ AND r


WILLIAM LEROY JENKINS
Lehigh University


By the short-cut method described below, the standard
deviation (σ) of a set of raw scores can be estimated quite accu-
rately without plotting or grouping into step intervals. The
coefficient of correlation (r) between two sets of paired scores
can also be quickly found without plotting. Empirical tests
indicate a mean discrepancy of only 3% between short-cut σ's
and r's and those computed by the usual methods.
Short-Cut Method for σ

1. Select by inspection the highest 10% of the scores and
find their mean. (For example, if N = 100, take the ten highest
scores. If N = 87, use the eight highest and seven-tenths of the
ninth.)
2. Select by inspection the lowest 10% of the scores and
find their mean.
3. Divide the difference between these two means by 3.5 to
get the standard deviation (σ). (The difference between the


TABLE 1
Sample Solution for σ

List of raw scores (N = 50). H marks the highest 10% and L the
lowest 10%; “?” marks an entry that is illegible in this copy.

  ?    87H   54    55    74
  ?     4L    ?     ?     ?
 83H    ?    28L   48    66
  ?    49    32    55    58
  ?    74    62    85H   48
 69    70    86H   61    52
 59     ?    72     ?     ?
  ?     ?    35    56    69
 37    92H    7L    ?    31
  ?    73    78    40     ?

Highest 10%:  83, 87, 92, 86, 85     Sum = 433     M = 433/5 = 86.6
Lowest 10%:   14,  4, 28, 27,  7     Sum =  80     M =  80/5 = 16.0
Difference = 70.6
σ = 70.6/3.5 = 20.2
(computed σ = 20.4)




TABLE 2 


Sample Solution for r 


Raw scores: ái ) 
D бу (short-cut 
x ч y (х-у) 83 1 428 -48 = 380 
83 66 17 84 18 38 
67 61 6 81 10 380. 76 
EH E L 24H 100 5 й 
4 -22L 80 14 = 
г а и FS — Dm vm 
84H 80H 4 
72 “4H -2 oy (short-cut) 254 
69 " 45 24H. 80 12 417-687 
1 21 -20 14 9 
34 50 -16 D 16 38. 69.8 
81H 9H -I 74 11 
75 51 24 H 97 15 98. 1994 
66 57 9 Ee "ebd 73.5 (computed 
51 74H -23L 417 68 20.40 
Be 56 -31 L 
24 ML 10 Я 
100H 97H 3 Gp (short-cut 137) =253 
56 60 - 4 24 e -l= 
24 34 -10 4 = BS 281 _ 
20 56 — -36L 2 cH Se mW 
18 L 20 - 20 = 36 
56 41 15 24 = 25 506 . 1446 
54 m 64 -10 кы pa “3.5 (computed 
80 73 7 -137 . 
78 64 14 “6 
48 58 -10 и 
46 32 14 r (short-cut differences) 
ж я: уре 
Dr і -6 ке miam 
5 1. 27 -22 "i 09.1 
l4 L liL 3 470.9 + 398.0 202.2. 
24 17 7 =—752171х199 
61 50 11 Iga 
62 54 8 =.763 (compute 
50 50 0 " 
70 65 5 
6 60 9 
64 64 0 " 
35 iL 20H 
39 34 5 
42 44 - 2 
43 68 -25L 
4l 47 - 6 
66 66 0 
47 42 5 
62 38 24 H 
66 67 - 1 
44 47 - 3 


means of the extreme tenths of a normal distribution is 3.51σ.)
Table 1 shows a sample solution.
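The three steps for σ can be sketched as follows. The function names are assumptions made for illustration; fractional cases (such as N = 87) are handled by taking the stated fraction of the next score. The last function checks the constant 3.51 against the unit normal distribution, using the fact that the mean of the upper tenth is the normal density at the 90th percentile divided by 0.10.

```python
from statistics import NormalDist


def extreme_tenth_mean(scores, highest=True):
    """Mean of the highest (or lowest) 10% of the scores."""
    if not scores:
        raise ValueError("at least one score is required")
    ordered = sorted(scores, reverse=highest)
    k = len(ordered) / 10.0          # number of scores in one tenth
    whole = int(k)
    frac = k - whole
    total = sum(ordered[:whole])
    if frac > 1e-12:                 # e.g. seven-tenths of the ninth score
        total += frac * ordered[whole]
    return total / k


def shortcut_sigma(scores):
    """Difference between the extreme-tenth means divided by 3.5."""
    high = extreme_tenth_mean(scores, highest=True)
    low = extreme_tenth_mean(scores, highest=False)
    return (high - low) / 3.5


def normal_extreme_tenth_gap():
    """Gap between the means of the extreme tenths of a unit normal."""
    unit = NormalDist()
    z = unit.inv_cdf(0.9)
    return 2 * unit.pdf(z) / 0.1


if __name__ == "__main__":
    # Extreme scores from Table 1; the 40 middle scores are hypothetical filler.
    scores = [83, 87, 92, 86, 85] + [50] * 40 + [14, 4, 28, 27, 7]
    print(round(shortcut_sigma(scores), 1))
    print(round(normal_extreme_tenth_gap(), 2))
```

Only the extreme tenths enter the estimate, so the filler values do not affect the result as long as they lie between the two extreme groups.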
Short-Cut Method for r

1. Calling the two distributions x and y, find the difference
(x − y) between each pair of scores.

2. By the short-cut method, find σ for x, for y, and for
D (x − y).

3. Substitute in the formula:

$$ r = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_D^2}{2\,\sigma_x\,\sigma_y} $$

Table 2 shows a sample solution. Note that the same
numerical answer is obtained by using the differences in the
means directly, instead of converting them into σ's.
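A minimal sketch of the short-cut r follows, with hypothetical function names. Because the divisor 3.5 appears squared in both numerator and denominator, it cancels, which is why the differences in the extreme-tenth means can be used directly. The figures 76, 69.8 and 50.6 are the mean differences for x, y and D read from Table 2.

```python
def tenth_gap(scores):
    """Difference between the means of the highest and lowest 10%."""
    ordered = sorted(scores)
    k = len(ordered) / 10.0
    whole = int(k)
    frac = k - whole

    def partial_sum(values):
        total = sum(values[:whole])
        if frac > 1e-12:
            total += frac * values[whole]
        return total

    return (partial_sum(ordered[::-1]) - partial_sum(ordered)) / k


def r_from_gaps(gap_x, gap_y, gap_d):
    """Short-cut r computed from the mean differences directly."""
    return (gap_x ** 2 + gap_y ** 2 - gap_d ** 2) / (2 * gap_x * gap_y)


def shortcut_r(x, y):
    """Short-cut r for two lists of paired scores."""
    d = [a - b for a, b in zip(x, y)]
    return r_from_gaps(tenth_gap(x), tenth_gap(y), tenth_gap(d))


if __name__ == "__main__":
    # Mean differences from Table 2: 380/5, 349/5 and 253/5.
    print(round(r_from_gaps(76.0, 69.8, 50.6), 3))
    # Dividing each gap by 3.5 first gives the same answer.
    print(round(r_from_gaps(76.0 / 3.5, 69.8 / 3.5, 50.6 / 3.5), 3))
```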


Results of Empirical Checks


Eighty samples of 50 and forty samples of 100 scores were
drawn at random from a normal distribution of 1000 scores.
The standard deviation of each sample was computed in the
standard way (using 21 step intervals) and also by the short-
cut method. Table 3 shows that the standard errors of the
short-cut σ's are not substantially greater than those of the
computed σ's.


TABLE 3


Comparison of Computed and Short-Cut σ's
(Population σ = 20.5)


                                                  N = 50      N = 100
Empirical standard error of computed σ's ....      1.82        1.03
Empirical standard error of short-cut σ's ....     1.91        1.20
Theoretical standard error ...................     2.05        1.45
Mean discrepancy between corresponding com-
  puted and short-cut σ's (absolute values
  illegible) .................................    (3.1%)      (2.3%)
Mean value of short-cut σ's ..................    19.8        20.2

[...] samples of 50 pairs and twenty samples of 100 pairs
were drawn at random from a population of 1000 paired scores.
The coefficient of correlation of each sample was computed by
the standard technique (using 21 step intervals for each distri-
bution) and also by the short-cut method. Table 4 shows that
the standard errors of the short-cut r's are not substantially
greater than those of the computed r's.


TABLE 4


Comparison of Computed and Short-Cut r's
(Population r = .764)


                                                  N = 50      N = 100
Empirical standard error of computed r's ....      .071        .035
Empirical standard error of short-cut r's ....     .074        .048
Theoretical standard error ...................     .059        .042
Mean discrepancy between corresponding com-
  puted and short-cut r's ....................     .025        .023
                                                  (3.3%)      (3.0%)
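The tables do not state how the theoretical standard errors were obtained. As an assumption, the classical large-sample formulas σ/√(2N) for a standard deviation and (1 − r²)/√N for a correlation coefficient reproduce the tabled values, as the sketch below shows. The function names are illustrative.

```python
import math


def se_sigma(sigma, n):
    """Large-sample standard error of a standard deviation (assumed formula)."""
    return sigma / math.sqrt(2 * n)


def se_r(r, n):
    """Large-sample standard error of a correlation coefficient (assumed formula)."""
    return (1 - r * r) / math.sqrt(n)


if __name__ == "__main__":
    for n in (50, 100):
        print(n, round(se_sigma(20.5, n), 2), round(se_r(0.764, n), 3))
```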


The short-cut methods for σ and r are particularly recom-
mended for use by students, because of the ease with which
errors can be detected. The methods also provide a time-saving
technique for the research worker who is not blessed with
modern computing equipment.


MEASUREMENT ABSTRACTS¹


Alper, Thelma G. “Task-Orientation vs. Ego-Orientation in Learn-
ing and Retention.” American Journal of Psychology, LIX
(1946), 236-248.
Forty undergraduates, twenty in a task-oriented group and twenty
in an ego-oriented group, were presented with a series of twenty items
under varying conditions to test three classical laws of learning and
retention. The results are summarized as follows: Law 1, “Immedi-
ate recall is superior to delayed recall,” holds only under conditions
of task-orientation. Law 2, “Intentional learning and retention are
superior to incidental learning and retention,” holds only under con-
ditions of inactive task-orientation. Law 3, “Motor activity facili-
tates learning and retention more than does inactivity,” holds only
under conditions of task-orientation in the absence of explicit instruc-
tions to learn. Suggestions are offered, on the basis of a trace theory
of learning, in explanation of the fact that ego-oriented traces are
superior in stability to task-oriented traces. Frances Smith.


Altus, William D. and Mahler, Clarence A. “The Significance of
Verbal Aptitude in the Type of Occupation Pursued by Illiter-
ates.” Journal of Applied Psychology, XXX (1946), 155-160.
In a study of 2,476 illiterate trainees made at the Ninth Service
Command Special Training Center, it was found that when average
standard scores on four verbal subtests of the Wechsler-Bellevue
Scales were computed for skilled, semi-skilled and unskilled white and
Negro groups, skilled and semi-skilled workers were reliably brighter
than unskilled. A further study, based on the extremes in tested
aptitude, showed three times as many skilled whites and almost twice
as many skilled Negroes scoring as high as the brightest ten per cent
of the total group, as was true of those scoring with the lowest ten
per cent. On the basis of these and similar findings, it is recom-
mended that the shortened form of the Army Wechsler employed in
these studies be used in discriminating between abilities of illiterates.
Frances Smith.


Bauman, Mary K. “Studies in the Application of Motor Skills Tech-
niques to the Vocational Adjustment of the Blind.” Journal of
Applied Psychology, XXX (1946), 144-154.
Seeking to apply psychological measures to the problem of
placement for the blind, the Trainee Acceptance Center in Phila-
delphia set up a battery of tests of motor skills and mental ability
and administered them to 312 legally blind (up to 20/200 vision)
persons. Rough standards of success in industry were established
through the cooperation of gainfully employed blind persons who also
took the tests. Learning curves of the blind were compared with
those of seeing persons for validating purposes. The results from the
various tests were intercorrelated to determine whether a single
ability or a group of abilities was involved. Findings show that, while
tests can greatly assist the experienced clinician, the ultimate point
of reference is the individual: his background, personality and moti-
vation; and that generalizations from a group study such as this are
inadequate because guidance and placement work deals with indi-
vidual men and women. Vernon S. Tracht.

¹ Edited by Forrest A. Kingsbury.


Brozek, J., Guetzkow, Harold, Mickelsen, Olaf, and Keys, Ancel.
“Motor Performance of Normal Young Men Maintained on Re-
stricted Intakes of Vitamin B Complex.” Journal of Applied
Psychology, XXX (1946), 359-379.
In a University of Minnesota study of the relationship between
intake of B-vitamins, particularly thiamine, and psychomotor per-
formance, eight “normal” men, 20 to 32 years of age, were maintained
for 161 days on a partially restricted diet, with four of the subjects
receiving a daily supplement of B-vitamins. There followed [period
illegible] on a diet practically free of B-vitamins, with subjects
re-grouped in four pairs as follows: restricted-deficient, restricted-
supplemented, supplemented-restricted, supplemented-supplemented.
Ten [unit illegible] of thiamine supplementation concluded the study.
Psychomotor measurements during the study included two strength
tests, speed of small hand-movements, gross body reaction time,
manual speed and coordination, and precise coordination. Results are
discussed in detail for each test, with the general conclusion that in
acute vitamin deficiency deterioration affects all psychomotor
functions, but that the degree of deterioration varies. Frances Smith.

Burton, Arthur and Bright, Charles J. “Adaptation of the Minne-
sota Multiphasic Personality Inventory for Group Administra-
tion and Rapid Scoring.” Journal of Consulting Psychology, X
(1946), 99-103.
The authors present a method of reducing errors and
fatigue, and conserving time and expense in scoring the Minnesota
Multiphasic Personality Inventory. The basic scoring procedure is
preserved, but the 550 items are printed upon pre-punched International
Business Machine tabulation cards according to an [words illegible].
These cards, after having been sorted by the examinee, are
scored by the IBM machine for the eleven scales in the inventory.
By this process, hand-scoring time may be reduced from [figure
illegible] thirty minutes to a minimum of four minutes by machine,
and the inventory is made more adaptable for large-scale educational
and industrial use. Harold Mosak.


Cattell, R. B. “Personality Structure and Measurement. II. The 
Determination and Utility of Trait Modality,” British Journal 
of Psychology, General Section, XXXVI (1946), 159-174. 
Psychologists have classified traits as dynamic, temperamental, 
and cognitive without explicitly defining the way these distinctions 
are made. To clarify these definitions, the writer suggests that (1) 
measures of dynamic traits respond to changes of incentive, (2) 
measures of abilities respond to alterations in complexity of the path 
to a goal, and (3) measures of temperamental traits respond the least 
to any changes in the field. Two methods—one for single variables 
and one for factor problems—are presented by which dynamic traits 
can be operationally distinguished from ability traits. The practical
and theoretical values of making modality distinctions and working
with pure traits arise from the fact that incentives or complexities
can be controlled independently in many everyday situations.
Frederick Gehlmann.


Edwards, Allen L. “A Critique of ‘Neutral’ Items in Attitude Scales
Constructed by the Method of Equal Appearing Intervals.” Psy-
chological Review, LIII (1946), 159-169.
The analysis of some of the "neutral" items included in attitude 
Scales constructed by the method of equal appearing intervals seems 
to establish that these “neutral” items tend to be non-differentiating. 
The items in the neutral zone tend to be relatively ambiguous and 
irrelevant, and may express attitudes of “indifference” and attitudes 
of “ambivalence.” For practical purposes the writer holds that the 
summated rating scales are preferable to the method of equal appear- 
ing intervals in attitude measurement. Irene P. Robinson. 


Ellis, Albert. “The Validity of Personality Questionnaires.” Psycho-
logical Bulletin, XLIII (1946), 385-440.
This paper reviews available objective validity studies under the
headings of Behavior Problem Diagnosis, Delinquency Diagnosis,
Psychiatric or Psychological Diagnosis, Rating Diagnosis, Test Inter-
correlations, and Over-rating or Lying Validations. Only question-
naires of the Woodworth, Thurstone, and Bernreuter type are con-
sidered, with experiments using the Minnesota Multiphasic Test,
an individually administered questionnaire, considered separately.
Summary of results obtained from studies made under the various
headings indicates that group-administered personality question-
naires of the type indicated are of dubious value in distinguishing
between groups of adjusted and maladjusted individuals and of even
less value in individual diagnosis. More research in the direction of
individually administered questionnaires is urged. The paper includes
a bibliography of 360 titles. Frances Smith.


Estes, Stanley G. “Deviations of Wechsler-Bellevue Subtest Scores
from Vocabulary Level in Superior Adults.” Journal of Abnor-
mal and Social Psychology, XLI (1946), 226-228.
Evidence is presented in support of the contention that when
patterns of deviations of subtest scores from vocabulary level on the
Wechsler-Bellevue scales are being used in differential diagnosis of
personality disorders, a correction for normal deviations is required
in cases where vocabulary level and educational and occupational his-
tories indicate a pre-maladjustment IQ of 110. Rapaport’s assump-
tion that in the well-adjusted person there should be little discrepancy
among subtest scores or little deviation from vocabulary level is ques-
tioned at this point. Vocabulary scatter for 102 college students and
recent graduates with a mean full scale IQ of 127 is analyzed, with
deviation scores on the Picture Arrangement and Object Assembly
subtests shown as particularly indicating the need of correction for
normal scatter in superior adults. Frances Smith.


Glanville, A. D., Kreezer, G. L., and Dallenbach, K. M. “The Effect
of Type-Size on Accuracy of Apprehension and Speed of Local-
izing Words.” American Journal of Psychology, LIX (1946),
220-235.
This study was divided into two parts: (1) A laboratory study
to determine the accuracy of apprehension of two type-sizes of stimu-
lus words (6 or 12 pt.) with 60 and 210 m. sec. exposures and
with blank and printed backgrounds; the results showed a constant
and reliable difference in favor of the 12 pt. type under all conditions,
and the background had little if any effect. (2) A practical study of
dictionaries using the two type-sizes for vocabulary-words. The
majority of subjects (50 adults and 50 school children) required more
time to locate vocabulary-words set in the 6 pt. rather than the 12 pt.
type. A majority of the Ss reported the large-type dictionary easier
to use. Irene P. Robinson.


Graham, Frances K. and Kendall, Barbara S. “Performance of
Brain-Damaged Cases on a Memory-for-Designs Test.” Journal
of Abnormal and Social Psychology, XLI (1946), 303-31[4].
Testing the hypothesis that impairment of visual-motor ability
is an indication of brain damage, the authors gave a memory-for-
designs test to an experimental group of 70 brain-damaged patients
and a control group of 70 persons. The latter were also from a popu-
lation of patients (not similarly afflicted, however) having the same
age range and educational and occupational background as the former
group. Results showed significant mean differences between them.
Impairment, as indicated by the test score, was rare in the control
group (occurring only with feeblemindedness or severe psychiatric
disorder), while it was more frequent (50 per cent of the cases) in the
experimental. Although asserting that the differentiating power of
this test is not as good as those using the “higher functions,” which
presumably suffer most when the brain is injured, they regard it as a
short, easily administered means of detecting brain damage. Vernon
S. Tracht.


Gulliksen, Harold. “Paired Comparisons and the Logic of Measure-
ment.” Psychological Review, LIII (1946), 199-213.
Recent discussions of the logic of psychological measurement have
overlooked important developments in the theory dealing with the
method of paired comparisons. This method in both the one-dimen-
sional and multi-dimensional case has scale values that (1) are not
dependent on the particular population of objects chosen, (2) are
not dependent on any arbitrarily defined relationship, and (3) by sub-
tracting any one scale value from another, give the results of an
experiment involving only the two objects. Hence, this method satis-
fies Campbell’s criteria for an extensive scale, if subtraction is substi-
tuted for addition. Certain similarities between paired comparison
and some types of physical measurement are discussed. Frederick
Gehlmann.


Guttman, Louis. “The Test-Retest Reliability of Qualitative Data.”
Psychometrika, XI (1946), 81-95.
The test-retest reliability of qualitative items, such as occur in
achievement tests, attitude questionnaires, public opinion surveys, 
and elsewhere, requires a different technique of analysis from that of 
quantitative variables. Definitions appropriate to the qualitative 
case are made both for the reliability coefficient of an individual on 
an item and for the reliability coefficient of a population on the item. 
From but a single trial of a large population on the item, it is possible
to compute a lower bound to the group reliability coefficient. Two 
kinds of lower bounds are presented. From two experimentally inde- 
pendent trials of the population on the item, it is possible to compute 
an upper bound to the group reliability coefficient. Two upper 
bounds are presented. The computations for the lower and upper 
bounds are all very simple. Numerical examples are given. (Cour- 
tesy Psychometrika.) 


Hartmann, George W. “The Effects of Noise on School Children.”
Journal of Educational Psychology, XXXVII (1946), 149-160.
Anticipating a sharp rise in school building programs in the post-
war era, and believing that architects and educators must cooperate
in the elimination of unnecessary noise, this author briefly reviews the
literature pertaining to the problem. Included is a discussion of the
alleged ill effects of school noises, the methods of measuring noise,
the relatively few experimental setups comparing pupil performance
in quiet and noisy settings, other laboratory findings and supporting
industrial investigations. The evidence from these various sources
indicates that efficiency in all kinds of mental effort is definitely low-
ered by persistent, annoying or distracting sounds. Vernon S. Tracht.


Heath, S. Roy, Jr. “A Mental Pattern Found in Motor Deviates.”
Journal of Abnormal and Social Psychology, XLI (1946), 223-
225.
This describes the mental characteristics of a type of individual
receiving little attention in the literature but often found by psy-
chologists in the armed forces. Such a person, although within the
normal range of intelligence, has noticeably poor muscular coordina- 
tion and exhibits a mental profile of normal crystallized ability and 
relatively lower fluid ability. The author defines the terms “crystal- 
lized” and “fluid” and gives a sample problem to illustrate how these 
abilities can be observed from almost any standard psychometric 
examination. Because of his consistently delayed reaction time in
both muscular and mental activity, the motor deviate must be given 
special consideration as regards his educational, social and occupa- 
tional adjustment. Vernon S. Tracht. 


Holzinger, Karl J. and Swineford, Frances. “The Relation of Two
Bi-Factors to Achievement in Geometry and Other Subjects.”
Journal of Educational Psychology, XXXVII (1946), 257-265.
A battery of eight spatial and three other tests was administered
to 183 pupils in plane geometry classes to determine the value of two
bi-factors, spatial and general deductive, in predicting achievement in
geometry. At the end of the school term the American Council
Cooperative Plane Geometry Test, Revised Series, was administered
to the same group to measure achievement. The data indicate that
the general factor, G, is a better forecaster for plane geometry than
the orthogonal space factor. Further analysis of the data indicates
that the general factor is a better predictor of scholastic success than
is the IQ. Harold Mosak.


Kilby, Richard W. “Relation of Iowa Silent Reading Test Scores
to Measures of Scholastic Aptitude and Achievement.” Journal
of Applied Psychology, XXX (1946), 399-405.
Correlations were run between the Iowa Silent Reading Test and
final grades and various aptitude measures of one hundred Yale
freshmen. In general the I.S.R. Test correlated positively with final
grades, some of the correlations being statistically significant. The
degree of correlation varied considerably according to the I.S.R.
subtest and the school subject. It was found that the I.S.R. Test
possessed an independent relation to final grades when other variables
were partialled out, and that it measured something other than that
measured by various aptitude tests. Leroy S. Burwen.


Krawiec, T. S. “A Comparison of Learning and Retention of Materials
Presented Visually and Auditorially.” Journal of General
Psychology, XXXIV (1946), 179-195.
An experimental study of the relative merits of visual and auditory
modes of presentation for the learning and retention of verbal
material, consisting of lists of nonsense syllables and of
nouns. Learning was by the anticipation method with a criterion of
two consecutive errorless trials. Retention was measured by recall
scores and the relearning and savings scores. This study showed visual
presentation as superior for learning both nonsense syllables and
nouns, but for retention neither mode of presentation was consistently
superior, though a slight trend toward the superiority of auditory
presentation was found. Irene P. Robinson.


Lasaga y Travieso, Jose I., in collaboration with Carlos Martinez- 
Arango. “Some Suggestions Concerning the Administration and 
Interpretation of the T.A.T." Journal of Psychology, XX (1946), 
117-163. 

This article makes detailed suggestions, supplemented by case
study material, concerning modifications in the technique of
administering and evaluating the results from the Thematic Apperception
Test elaborated by Murray and Morgan of Harvard. These involve
selection of the pictures, manner of making up the stories, study of
the sources of the patient’s stories, and the interview which occurs
after the pictures have been shown and analyzed. Certain new techniques
are also mentioned, namely, the study of reaction time, of
rejected ideas, and failures to invent stories or interpret the pictures
on the patient's part; a means of facilitating the analysis of the
stories; and due consideration for the symbolism of unconscious origin
which may appear. Vernon S. Tracht.


Lefford, Arthur. “The Influence of Emotional Subject Matter on 
Logical Reasoning.” Journal of General Psychology, XXXIV 
(1946), 127-151.
A group of 186 college students were given a questionnaire of
paired syllogisms, consisting of two groups of 20 each, equated as
to structure and length but differing in content, that of one syllogism
of each pair being socially controversial in nature, and that of the
other, neutral. The syllogisms were judged for validity and truth.
Results obtained from the validity judgments indicate that most
subjects solve neutrally-toned syllogisms more correctly than
emotionally-toned syllogisms. Distributions of the partiality (True-Untrue)
scores tend to show that reasoning is influenced both by attitudes and
beliefs and by previous knowledge of the truth or falsity of
conclusions. Analysis of data by means of a corrected correlation ratio
shows little relationship between ability to reason accurately in
non-emotional and in affective situations. Frances Smith.


Lough, Orpha M. “Teachers College Students and the Minnesota
Multiphasic Personality Inventory.” Journal of Applied Psychology,
XXX (1946), 241-247.
The Minnesota Multiphasic Personality Inventory was given to
8 unmarried women students at a state teachers college to determine
(1) whether significant differences existed on any of the scales
between those taking music and those taking the general curriculum;
(2) whether the Inventory would be of selective value in admitting
[...] and (4) whether it indicates
in these students the kinds of maladjustments attributed to teachers 
in various studies. Results showed the entire group to be relatively 
stable, with no reliable differences between those in one field of study 
or the other. Further research is needed before definite conclusions 
can be reached on the other points. Vernon S. Tracht.


[...] “Validity of the Hunt-Minnesota Test for Organic
Brain Damage.” Journal of Applied Psychology, XXX
(1946), 271-275.
The Hunt-Minnesota Test for Organic Brain Damage was applied
to 64 employees of the Norwich State Hospital, with the result that
[...] were found to have scores indicating organic brain
damage. These results are opposed to Hunt’s findings of 9.8 per cent
of [...] scores among normal subjects. While it was found that
the Norwich Hospital results included cases with very high vocabulary
scores and cases given only the short form of the test, these did
not account for the discrepancy, and it is concluded that the test
needs revalidation on both normal and organic subjects. Frances
Smith.


[...] knowledge in the field of opinion and attitude measurement.
[...] respect to both the theoretical and experimental justification
[...] questions are clearness, stability of the frame of reference,
[...]. Problems [...] of the study [...] in public opinion are
outlined. The understanding of [...] among attitudes and opinions
may be advanced by [...] methods, but the applicability of these
techniques in such a broad and diverse field seems limited. Studies
of morale are reviewed.


Bibliography of 133 references. Frederick Gehlmann. 


Mote, Marjorie E. “The Evaluation of Certain Factors [...] the
Success of Students Entering the College of Pharmacy [...] of
Minnesota from 1933 Through [...].” Journal of Experimental
Education, XIV (1946), 207-224.
In this study, numerous data from high-school and college records
and [...] of tests of students at the University of Minnesota College
of Pharmacy were subjected to extensive statistical analysis in
an effort to predict the success of entering students. Certain factors
were found to be valuable for predicting success, and a number of 
prediction formulas were derived. Leroy S. Burwen. 


Peel, E. A. “A New Method for Analyzing Aesthetic Preferences:
Some Theoretical Considerations.” Psychometrika, XI (1946), 
129-137. 

The aesthetic preferences of a group of persons are obtained from 
their orders of sets of pictures and patterns according to “liking.” 
The same pictures are ordered independently by a team of experts, 
according to certain artistic criteria such as naturalism, composition, 
color, rhythm, etc. The orders of preference and orders according to
the criteria are compared by correlation and matrices of correlation 
formed from (1) correlations between the persons’ orders of prefer- 
ence; (2) correlations between the orders of preference and orders 
according to artistic criteria; and (3) correlations between the cri- 
terion orders. These matrices are symbolized by $R_p$, $R_{pc}$, and
$R_c$, respectively, and combined to form a single matrix

$$\begin{pmatrix} R_p & R_{pc} \\ R_{pc}' & R_c \end{pmatrix}.$$

Three interesting analyses of this matrix are suggested: analysis of 
the whole matrix into its factors and rotation of the factors about 
the criteria, regression estimates of individual preferences on the 
artistic criteria, and regression estimates of the person preference fac- 
tors on the same criteria. Theoretical conditions and consequences 
of these analyses are then discussed by the use of matrix notation.
(Courtesy Psychometrika.)
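
As a rough illustration of the three correlation matrices described in this abstract, the following Python sketch builds them from rank orders and joins them into one combined matrix. The abstract gives no data or formulas, so the rank orders, the function name `rank_correlations`, and the use of rank (Spearman-type) correlation are all assumptions made for the example, not Peel's own procedure.

```python
import numpy as np


def rank_correlations(a, b):
    """Correlate every row of a with every row of b.

    Rows are assumed to be rank orders without ties, so the ordinary
    product-moment correlation of the ranks is a rank correlation.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    az = (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
    bz = (b - b.mean(axis=1, keepdims=True)) / b.std(axis=1, keepdims=True)
    return az @ bz.T / a.shape[1]


# Hypothetical rank orders of five pictures (not taken from the article).
person_orders = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [5, 4, 3, 2, 1],
]
criterion_orders = [
    [1, 3, 2, 4, 5],  # e.g. an expert ordering by "naturalism"
    [4, 5, 1, 2, 3],  # e.g. an expert ordering by "composition"
]

R_p = rank_correlations(person_orders, person_orders)         # persons x persons
R_pc = rank_correlations(person_orders, criterion_orders)     # persons x criteria
R_c = rank_correlations(criterion_orders, criterion_orders)   # criteria x criteria

# Combined symmetric matrix with the cross-correlations off the diagonal blocks.
R = np.block([[R_p, R_pc], [R_pc.T, R_c]])
print(np.round(R, 2))
```

The combined matrix is square, with one row per person followed by one row per criterion; the factor and regression analyses the abstract mentions would start from a matrix of this shape.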


Peixotto, Helen E. “The Relationship of College Board Examina- 
tion Scores and Reading Scores for College Freshmen.” Journal 

of Applied Psychology, XXX (1946), 406-411. 

Scores of 263 students on the College Board Examinations and 
the Cooperative English Test C2, Reading Comprehension, were 
investigated. Intercorrelations of scores on the verbal Scholastic 
Aptitude Test, the English Essay Test, and the reading test were 
computed. All correlations were significant at the one per cent level. 
It was concluded that reading efficiency is an important factor in
scores on the verbal Scholastic Aptitude Test so that the latter might
be used as a preliminary screening device for remedial reading. Also
results showed that a remedial reading program would have little
effect on courses in English Composition. Leroy S. Burwen.


Rohde, Amanda R. “Explorations in Personality by the Sentence 
Completion Method." Journal of Applied Psychology, XXX 
(1946), 169-181. 

A projective technique type of personality study is described by
the author, based upon a revision and extension of Payne's sentence
completion test, and employing responses to carefully formulated
sentence beginnings after the manner of free association. The aim
in devising this instrument was to make available to schools and other
institutions a simply administered and interpreted projective method
adapted to large numbers of individuals. Experimental validation
was done on 670 ninth-grade students from several different high
schools, this adolescent age level being considered likely to reveal
personal problems of adjustment. Correlation coefficients were computed
between ratings of the student's responses and those from the
combined judgments of teachers, counselors and others. Vernon S.
Tracht.


Sartain, A. Q. “Relation Between Scores on Certain Standard Tests
and Supervisory Success in an Aircraft Factory.” Journal of
Applied Psychology, XXX (1946), 328-332.
The following tests were administered to forty men in supervisory
positions at an aircraft factory: Otis Self-Administering Test
of Mental Ability (Higher Examination); Tiffin and Lawshe Adaptability
Test (Form A); Revised Minnesota Paper Form Board; Bennett
Test of Mechanical Comprehension (Form AA); Remmers-File
How Supervise? Test (Experimental Edition, Form A); Bernreuter
Personality Inventory; and Kuder Preference Record. Rating
scales checked for reliability and validity were used as the criterion
of success. Correlation of these with test scores was statistically
insignificant, thus indicating that these tests had little or no predictive
value for success in supervision in this plant. Leroy S. Burwen.

Votaw, David F. “Regression Lines for Estimating Intelligence
Quotients and American Council Examination Scores.” Journal
of Educational Psychology, XXXVII (1946), 179-181.
This illustrates the method of predicting a student's score on the
American Council Psychological Examination from his score on a
previously given IQ test, and conversely estimating his IQ from his
score on the ACE. The writer gave the Otis Group Intelligence Scale
to 70 junior high-school students, following 6 years later with the
ACE when these same subjects entered college. The results of this
study are used to demonstrate by textual explanation and accompanying
chart the procedure in reading regression lines. Vernon S.
Tracht.
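
The two-way prediction described in this abstract rests on the fact that there are two distinct least-squares lines, one for estimating the ACE score from the IQ and another for estimating the IQ from the ACE score. A minimal Python sketch follows; the scores are invented for illustration and the function name `regression_line` is hypothetical, since the abstract reports neither Votaw's data nor his equations.

```python
import numpy as np


def regression_line(x, y):
    """Least-squares line for predicting y from x; returns (slope, intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    slope = (dx * dy).sum() / (dx ** 2).sum()
    intercept = y.mean() - slope * x.mean()
    return slope, intercept


# Hypothetical paired scores for six students (not from the article).
iq = np.array([95, 100, 105, 110, 120, 125], dtype=float)
ace = np.array([60, 72, 70, 85, 98, 101], dtype=float)

# Line 1: estimate the ACE score from the IQ.
b_ace, a_ace = regression_line(iq, ace)
# Line 2: estimate the IQ from the ACE score (a different line, not the inverse).
b_iq, a_iq = regression_line(ace, iq)

r = np.corrcoef(iq, ace)[0, 1]

print(f"ACE estimate = {a_ace:.2f} + {b_ace:.3f} * IQ")
print(f"IQ estimate  = {a_iq:.2f} + {b_iq:.3f} * ACE")
print(f"r = {r:.3f}; product of slopes = {b_ace * b_iq:.3f}")
```

The product of the two slopes equals the squared correlation, so the two lines coincide only when the correlation is perfect; this is why a chart for reading estimates needs both lines.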


Wimberley, Stan E. “A Systematic Error in Kuhlmann-Anderson
Mental Ages.” Journal of Educational Psychology, XXXVII
(1946), 161-170.
Analysis of the data from 77 school children and from [...] clinical
subjects indicated that measurements by the Kuhlmann-Anderson
Tests were producing inconsistencies, i.e., tests of too great difficulty
yielded M.A. and IQ's too high, while those too easy gave corresponding
values too low. Rather than accept the motivational explanation
of this discrepancy, the author shows that standardization of
the scale on the basis of the wrong regression line ([...] on
chronological age) is responsible, and hopes that a means will be
found of correcting this error in these otherwise generally excellent 
tests. Vernon S. Tracht. 


Wittenborn, J. R. “Correlates of Handedness Among College Fresh- 
men.” Journal of Educational Psychology, XXXVII (1946), 
161-170.
To determine whether any relationship exists between language
facility and cerebral dominance as it is manifest in handedness, and
whether left handedness is a handicap, a Yale freshman class was
divided into four groups on the basis of questionnaire responses as to
their manual preferences. These groups in turn were compared with
each other on self-ratings in reading, spelling, writing and speech; and
on test scores in reading rate and comprehension, scholastic aptitude,
English essay, mathematical aptitude, spatial visualization, and 
verbal and quantitative reasoning. Results indicate that handedness, 
either ambidextrous or undetermined, has negligible if any signifi- 
cance for language facility, although there is some evidence that left 
handedness may result in a slight handicap, principally in mathe- 
matical ability. Vernon S. Tracht. 


ADDITIONAL ARTICLES NOT ABSTRACTED 


Bernreuter, Robert G. and Jackson, Theodore A. “Sales Personnel
Selection and Related Services.” Journal of Consulting Psychology,
X (1946), 127-130.
Bradford, E. J. G. “Selection for Technical Education. Part II.”
British Journal of Educational Psychology, XVI (1946), 69-81.
Buck, John N. “The Time Appreciation Test.” Journal of Applied
Psychology, XXX (1946), 388-398.
Cohen, Leonard and Strauss, Leonard. “Time Study and the Fundamental
Nature of Manual Skill.” Journal of Consulting Psychology,
X (1946), 146-153.
Dyer, Henry S. “The Validity of Certain Objective Techniques for
Measuring the Ability to Translate German into English.” Journal
of Educational Psychology, XXXVII (1946), 171-178.
Eagleson, Oran W. “Students’ Reactions to Their Given-Names.”
Journal of Social Psychology, XXIII (1946), 187-195.
Festinger, Leon. “The Significance of Difference Between Means
Without Reference to the Frequency Distribution Function.”
Psychometrika, XI (1946), 97-105.
Fisher, M. Bruce [...]. “Standardization of a Test of Hand Strength.”
Journal of Applied Psychology, XXX (1946), 380-387.
Herfindahl, [...]. “Methods for Direct Reading of Standard
Scores on an Electric Scoring Machine.” Journal of Educational
Psychology, XXXVII (1946), 234-241.
Hilden, Arnold H. “A [...] Succession Chart.” Journal of
Psychology, XX (1946), 53-58.
Himmelweit, H. T. “Speed and Accuracy of Work as Related to
Temperament.” British Journal of Psychology, General Section,
XXXVI (1946), 132-144.


Humm, Doncaster G. “Test Validation on Remote Criteria.” Journal
of Applied Psychology, XXX (1946), 333-339.
Israeli, Nathan. “Studies in Occupational Analysis: II. Originality.”
Journal of Psychology, XX (1946), 77-87.
Krueger, William C. F. “Rate of Progress as Related to Difficulty
of Assignment.” Journal of Educational Psychology, XXXVII
(1946), 247-249.
Levinson, Daniel J. “A Note on the Similarities and Differences
Between Projective Tests and Ability Tests.” Psychological Review,
LIII (1946), 189-194.
Luchins, A. S. “On Certain Misuses of the Wechsler-Bellevue
Scales.” Journal of Consulting Psychology, X (1946), 109-111.
Luchins, A. S. and Luchins, E. H. “Towards Intrinsic Methods in
Testing.” Journal of Educational Psychology, XXXVII (1946),
[...].
Meehl, Paul E. and Jeffrey, Mary. “The Hunt-Minnesota Test for
Organic Brain Damage in Cases of Functional Depression.” Journal
of Applied Psychology, XXX (1946), 276-287.
Mellenbruch, P. L. “A Preliminary Report on the Miami-Oxford
Curve-Block Series.” Journal of Applied Psychology, XXX
(1946), 129-134.
Morton, John A. “A Study of Children’s Mathematical Interest
Questions as a Clue to Grade Placement of Arithmetic Topics.”
Journal of Educational Psychology, XXXVII (1946), 293-[...].
Moreton, Frank E. “Attitudes of Teachers and Scholars Towards
Co-Education.” British Journal of Educational Psychology, XVI
(1946), 82-95.
Murray, Henry A. and MacKinnon, Donald. “Assessment of OSS
Personnel.” Journal of Consulting Psychology, X (1946), [...].
Myklebust, Helmer R. “Significance of Etiology in Motor Performance
of Deaf Children with Special Reference to Meningitis.”
American Journal of Psychology, LIX (1946), 249-258.
Nixon, H. K. “Internal Evidence of Validity of a Rating S[...].”
Journal of Psychology, XX (1946), 97-115.
Rose, Florence C. and Rostas, Steven M. “The Effect of Illumination
on Reading Rate and Comprehension of College Students.”
Journal of Educational Psychology, XXXVII (1946), [...].
Rothe, Harold F. “Output Rates Among Butter-Wrappers: I. Work
Curves and Their Stability.” Journal of Applied Psychology,
XXX (1946), 199-211.
Rothe, Harold F. “Output Rates Among Butter-Wrappers: II. Frequency
Distributions and an Hypothesis Regarding the ‘Restriction
of Output.’” Journal of Applied Psychology, XXX (1946),
[...]-327.
Sanford, R. Nevitt. “Age as a Factor in the Recall of Interrupted
Tasks.” Psychological Review, LIII (1946), 234-240.
Sartain, A. Q. “Predicting Success in a School of Nursing.” Journal
of Applied Psychology, XXX (1946), 234-240.


Shneidman, Edwin S. “A Short Method of Scoring the Minnesota 
Multiphasic Personality Inventory." Journal of Consulting Psy- 
chology, X (1946), 143-145. 

Thurstone, L. L. “A Single Plane Method of Rotation.” Psycho-
metrika, XI (1946), 71-79. 

Tsao, Fei. “General Solution of the Analysis of Variance and Covariance
in the Case of Unequal or Disproportionate Numbers of
Observations in the Subclasses.” Psychometrika, XI (1946),
107-128.

Turnbull, William W. “A Normalized Graphic Method of Item
Analysis.” Journal of Educational Psychology, XXXVII (1946),
9-141.

Tyler, Leona E. “An Exploratory Study of Discrimination of Composer
Style.” Journal of General Psychology, XXXIV (1946),
153-163.

Weitz, Robert D. “The Occupational Adjustment Characteristics of
a Group of Sexually Promiscuous and Venereally Infected Females.”
Journal of Applied Psychology, XXX (1946), 248-254.

Wells, F. L. and Woods, W. L. “Outstanding Traits: In a Selected
College Group, with Some Reference to Career Interests and War
Records.” Genetic Psychology Monographs, XXXIII (1946),
-249.

Wesman, Alexander G. “The Usefulness of Correctly Spelled Words
in a Spelling Test.” Journal of Educational Psychology, XXXVII
(1946), 242-246.


NEW TESTS* 


Advanced Perception of Relations Scales, by Lindsey R. Harmon and 
M. J. Van Wagenen, 1946. These scales are designed to measure 
ability to perceive abstract relationships presented verbally. No 
time limit; requires about thirty minutes. Machine scorable. 
Package of 25 with directions and scoring key, $1.00. Published 
by Educational Test Bureau. 


Basic Skills in Arithmetic Test, by W. L. Wrinkle, J. Sanders, and 
E. Kendel, 1945. This test is designed as a measure of the funda- 
mental skills in arithmetic. The problems involve whole num- 
bers, fractions, decimals, and percentages. A diagnostic record 
sheet for identifying individual and class deficiencies accompanies 
the tests. Range: junior and senior high school. Package of 


25, $2.35; specimen set, 50¢. Published by Science Research 
Associates. 


Cancellation Test, by John R. Roberts, 1946. The test consists of
lines of mixed letters in which specified letters are to be crossed
out. Constructed for use in the selection of visual inspectors.
Time: 10 minutes. Package of 25, 75¢, scoring key, 80¢. Published
by Educational Test Bureau.


California Test of Mental Maturity, 1946 revision, by Elizabeth T. 
Sullivan, Willis W. Clark and Ernest W. Tiegs. Available in 
five levels: Preprimary, Primary, Elementary, Intermediate and 
Advanced. $1.75 per package of 25 tests with manual of direc- 
tions and scoring key. Published by California Test Bureau. 


Clerical Perception Test, by G. Bernard Baldwin, 1946. A 15-minute 
Speed test. One form. Measures ability to perceive rapidly the 
minute details in verbal and numerical material. Package of 25 
with directions, 75¢. Published by Educational Test Bureau.


Coordinated Scales of Attainment, by James A. Fitzgerald, Dora V.
Smith, M. J. Van Wagenen, U. W. Leavell, Edgar B. Wesley,
M. E. Branom, L. J. Brueckner, Ellen Frogner, Victor C. Smith,
and August Dvorak. The scales consist of a separate battery for
each grade, the first through the eighth. Elementary Batteries.
Test booklet (re-usable) each battery, package of 25 with directions,
$2.50; pupil answer sheet booklet, for either hand or machine
scoring and containing individual profile chart, package of
25, $1.25; test marker pencils, each, 5¢; scoring keys, per set, 2[...].
Primary Batteries. Test booklet, not re-usable, with directions
and hand scoring keys in each order: Battery 3, Grade III, per
package of 25, $1.75; Battery 2, Grade II, per package of 25,
$1.50; Battery 1, Grade I, per package of 25, $1.50; complete
manual (one or more supplied with each order) if ordered separately,
50¢; specimen set, consisting of complete manual, one
battery 8 booklet, one battery 3 booklet, directions, pupil answer
booklet, class record, tabulation sheet, and sample scoring key,
75¢. Published by Educational Test Bureau.

* The addresses of the publishers of the tests listed are given at the end of the [...]


Examining for Aphasia, by Jon Eisenson, 1946. The materials used
for the examination of aphasia and related disturbances consist
of common objects as well as test materials bound in the manual.
Manual, $2.00; package of 50 record forms, $3.50; manual and
package of 50 record forms, $5.00. Published by The Psychological
Corporation.


Gates Reading Diagnostic Tests, revised edition, by Arthur I. Gates,
1945. A reading diagnosis battery for use with individual pupils
having specific reading disabilities. For use with all grades.
Manual, [...] of either Form I or Form II, 50¢ each; [...]
cards, 10¢; pupil's record booklet, 20¢ each; specimen set,
$1.50. Published by Bureau of Publications, Teachers College,
Columbia University.


Gregory Academic Interest Inventory, by W. S. Gregory. This
inventory “was developed to provide a means of objectively
measuring and comparing students’ interests in the various
departmental curricula of colleges and universities.” Stencils are
available for twenty-eight areas of specialization. No time limit;
the test is usually completed within an hour. Test booklets:
$2.50 per 25 copies; $4.75 per 50 copies; $9 per 100 copies; single
copies, 10¢. Answer sheets: 3¢ each; 500 to 1000 at [...]
discount; 1000 up to 20% discount. Scoring stencils: 90¢ each;
2-9, 75¢ each; 10 or more 65¢ each; complete set of [...]
(specify whether to be used for hand-scoring or for machine
scoring). Manuals: 10¢ each. Profile charts 1¢ each. [...] of
scoring weights, 10¢. Published by The Sheridan Supply Company.


Hand-Tool Dexterity Test, by George K. Bennett. This performance
test in the use of the wrench and screwdriver involves removing
nuts, washers, and bolts from one upright according to the prescribed
sequence and reassembling them on another upright.
For use with applicants for mechanical work or training. The
score is the time required, which is less than 11½ minutes for 99%
of male factory-workers. Complete apparatus with manual,
$15.00; manual alone, 20¢. Published by The Psychological
Corporation.


Oral Directions Test, by Charles R. Langmuir. This oral test of 
general mental ability is administered by means of phonograph 
records. The subjects indicate answers on a two-page answer 
sheet. For adults. Time: 28 minutes. Album consisting of one 
16-inch record (ODT-Transcription Record) to be played on
each side at 33⅓ rpm, with manual, 100 copies of answer sheet
and key, $15.00. Album consisting of four 12-inch records
(ODT-Standard Records, seven sides) to be played at 78 rpm,
with manual, 100 copies of answer sheet, and key, $17.00. Extra
answer sheets sold in packages of 100: 1 to 9 packages, $4.00
each; 10 or more packages, $3.50 each. Plastic-covered key,
$1.00; paper key, 15¢. Manual alone, 25¢. Published by The
Psychological Corporation.


Oseretsky Tests of Motor Proficiency, a translation edited by Edgar 
A. Doll, 1946. Time: 20-30 minutes. Age-range, four years to 
maturity. This scale is an individual test originally produced 
by Dr. N. Oseretsky in Russia in 1923. It is designed to measure
motor maturation in terms of age-equivalents which are ex- 
pressed as “motor age." Manual and scale, $1.00; individual 


record sheet, package of 25, $1.25. Published by Educational 
Test Bureau. 


Progressive Tests in Social and Related Sciences, by John Sexson and
Georgia Sachs Adams, 1946. Now available for the elementary
level. Package of 25 with manual and scoring directions: Parts
I and II, $1.25 per package; Part III, $1.00 per package. Specimen
set, 25¢. Published by California Test Bureau.


Seashore-Bennett Stenographic Proficiency Test, by Harold G. Sea- 
shore and George K. Bennett, 1946. Each of the two forms of
this test consists of two 12-inch phonograph records to be played
on both sides. Five letters are recorded in each form: two are
short and slow; two are medium in length and average in speed;
one is long and rapid. The instructions and the dictation of the
five letters of one form require about twenty minutes. Transcription
requires from thirty minutes to an hour. Album of
Forms B-1 and B-2 (four records) with manual and 100 summary
charts, $15.00. Additional summary charts, $2.00 for 100.
Manual, alone, 50¢. Published by The Psychological Corporation.


Tests of Human Growth and Development, by John Horrocks and
Maurice E. Troyer, 1946. These tests are intended for use in the
undergraduate and in-service professional training of teachers.
The battery consists of the following: (1) The Knowledge of
Facts and Principles includes items on facts and concepts about
human growth and development. Package of 25 including key,
$1.75; five or more packages, $1.50 each; extra keys, one cent
each. (2) The Case of Barry Black centers mainly in conflict,
frustration, and insecurity in social situations. Package of [...]
including key, $2.50; five or more packages, $2.25 each; answer
sheets, $1.00 per 100; extra keys, one cent each. (3) The Case
of Connie Casey centers mainly in physical and economic factors.
Prices same as for The Case of Barry Black. (4) The Case of
Sam Smith centers in intellectual and academic factors. Prices
same as for The Case of Barry Black. Sample set including a
copy of each test with keys, 60¢. Instructor's Guide, 15¢. Published
by Syracuse University Press.


Tests of Primary Mental Abilities for Ages 5 and 6, by Thelma Gwinn
Thurstone and L. L. Thurstone, 1946. These tests are designed
to measure the five basic aptitudes for learning which have thus
far been isolated in young children: Verbal-Meaning, Quantitative,
Space, Perceptual-Speed, and Motor. Time: about one
hour. Package of 25, $2.35; specimen set, 50¢. Published by
Science Research Associates.


Unit Scales of Aptitude, Forms 4 MA and 4 MB, by M. J. Van
Wagenen. A rearrangement of Division 4 in the Unit Scales of
Aptitude. Composed of three verbal tests: Reading Vocabulary,
Composition Vocabulary, and Perception of Relations. For
grades 9 through 12 and adults. Separate answer sheet [...]
hand and machine scorable. Booklets, package of 5, [...];
answer sheets, package of 50, 50¢; specimen set, [...];
manual and scoring key included with each order. Published by
Educational Test Bureau.


Vineland Social Maturity Scale, by Edgar A. Doll, 1946. Time: 20-30
minutes. Range: birth to maturity. The items are arranged
in order of increasing average difficulty in eight categories: Self-Help
(general, eating, dressing), Locomotion, Occupation, Communication,
Self-Direction, and Socialization. Standards [...] in
age-equivalents. Manual: paper bound, 80¢, cloth bound, [...].
School discount on manual, 25%. Individual record blanks,
package of 25, $1.00, specimen set, $1.25. Published by Educational
Test Bureau.


Vocational Aptitude Examination, by Glenn U. Cleeton, 1946. The
test is designed to measure aptitude for the following [...]
areas: (1) sales, (2) scientific-technical, (3) accounting [...],
(4) executive and business management. It is intended for use
in high school and college. Time: about 75 minutes. 10¢ each
for 1 to 5 copies, 7½¢ each for 5 to 100 copies, $5.00 per hundred;
specimen set, 40¢. Published by McKnight and McKnight.


The Wechsler-Bellevue Intelligence Scale, Form II, by David 
Wechsler, 1946. Test materials, including 25 record blanks and 
manual for Form II, $11.25; manual alone, $1.75. Package of 
25 record blanks, 90¢; package of 100, $3.25; 10 or more packages 
of 100, $3.00 each. Published by The Psychological Corporation. 


ADDRESSES OF THE PUBLISHERS OF THE 
TESTS LISTED 


Bureau of Publications, Teachers College, Columbia University, New 
York 27, New York. 

California Test Bureau, 5916 Hollywood Boulevard, Los Angeles, 
California. 

Educational Test Bureau, Minneapolis, Minnesota; Nashville, Ten- 
nessee; and Philadelphia, Pennsylvania. 

McKnight and McKnight, 109-111 West Market Street, Blooming- 
ton, Illinois. 

Psychological Corporation, 522 Fifth Avenue, New York 18, New
York.

Science Research Associates, 228 South Wabash Avenue, Chicago 4,
Illinois.

Sheridan Supply Company, P.O. Box 837, Beverly Hills, California.

Syracuse University Press, 920 Irving Avenue, Syracuse 10, New
York.


THE CONTRIBUTORS 


Hubert E. Brogden—Ph.D., University of Illinois, 1939. Instruc- 
tor in Psychology, Ohio State University, 1939-1940. Statistician, 
U. S. Public Health Service, 1940-1942. Employed by Personnel 
Research Section, Adjutant General's Office of the War Department, 
1943-1946. Author of articles in Psychometrika, Journal of Educa- 
tional Psychology, Psychological Monographs, Journal of General 
Psychology.


William P. Chase—Ph.D., University of Minnesota, 1935. In-
structor in Psychology, Dartmouth College, 1930-1932. Research
Assistant, University of Minnesota, 1932-1935. Instructor in Psy- 
chology, University of Alabama, 1935-1937. Assistant Professor of 
Psychology, The Woman's College of the University of North Caro- 
lina, 1937-1942. Officer in U. S. Army: Personnel Consultant, Fort 
Bragg, N. C., Director of Instruction, Separation Classification School, 
Fort Dix, N. J., and Separation Classification and Counseling Officer, 
Headquarters, Second Service Command, Governors Island, N. Y., 
1942-1946. Vocational Advisement Supervisor, Vocational Advise- 
ment and Guidance Service, Veterans Administration, 1946-. Author 
of articles on learning and studies of attitude. Associate Member, 
American Psychological Association. 


Lee J. Cronbach—Ph.D., University of Chicago, 1940. Instruc- 
tor, Assistant Professor of Psychology, State College of Washington, 
1940-1946. Associate Psychologist, University of California Division 
of War Research, 1944-1945. Assistant Professor of Education, Uni-
versity of Chicago, 1946-. Author of articles on test construction,
statistics, morale and learning. Associate Member, American Psy-
chological Association.


John C. Flanagan—Ph.D., Harvard University, 1934. Teacher 
of science, Renton, Wash., 1929-1930. Teacher of mathematics and 
athletic coach, Cleveland High School, Seattle, Wash., 1930-1932. 
Assistant in Education, Graduate School of Education, Harvard Uni-
versity, 1934-1935. Associate Director, Cooperative Test Service,
American Council on Education, 1935-1941. Lecturer, Teachers
College, Columbia University, 1936-1941. Chief, Psychological
Branch, Air Surgeon's Office, U. S. Army, and Director, Aviation
Psychology Program, Army Air Forces, 1941-1946. Professor of
Psychology, University of Pittsburgh, 1946-. Author of tests of
scholastic achievement, interests, personality and aptitude; mono-
graphs and articles regarding test construction, statistical method,
and social psychology. Author of summary report on research in 
Aviation Psychology in the Army Air Forces. Member, American 
Psychological Association, Society for the Advancement of Education,
American Educational Research Association, New York Academy, 
American Statistical Association, Psychometric Society, American 
Association for the Advancement of Science. 


J. P. Guilford—Ph.D., Cornell University, 1927. Instructor in 
Psychology, University of Illinois, 1926-1927. Assistant Professor 
of Psychology, University of Kansas, 1927-1928. Associate Professor 
and Professor of Psychology, University of Nebraska, 1928-1940. 
Director, Bureau of Instructional Research, University of Nebraska, 
1938-1940. Professor of Psychology, University of Southern Cali- 
fornia, 1940-1942. In U. S. Army: Director, Psychological Research 

- 5, Santa Ana Army Air Base, and Psychological Unit No. 2, 


American Psychological A 


William Leroy Jenkins—Ph.D.
Instructor, Assistant Professor, Lehigh University, 1935-1943. Re-
search Associate, University of California Division of War Research,
1943-1944. Supervisor, Training Aids, Columbia University Divi-
sion of War Research, Submarine Training Section, 1944-1945.
Associate Professor of Psychology, Lehigh University, 1946-. Author
of articles on cutaneous sensitivity. Member, American Psychologi-
cal Association.


Joseph E. King—Ph.D., University of Chicago, 1946. Lecturer
in Psychology, Loyola University, 1939-1945. Clinician, University
of Chicago Laboratory Schools, 1940-1942. [illegible] Psychologist,
Army Air Forces, 1942-1946. Test Editor, Science Research Associ-
ates, 1946-. Member, American Psychological Association, Psycho-
metric Society.


John M. Stalnaker—M.A., University of Chicago, 1928. Purdue
University, 1926-1931. University of Chicago, 1931-1936. College
Entrance Examination Board, 1936-1945. Princeton University,
1936-1945. Dean of Students and Professor of Psychology, Stanford
University, 1945-. Director and Secretary-Treasurer, Pepsi-Cola
Scholarship Board, 1945-. Consultant, Navy Department and De-
partment of State. General Director, Army-Navy College Qualifying
Test Program, during the war. Member, American Psychological
Association, Psychometric Society, American Statistical Association, 
National Education Association, Educational Research Association. 


Ruth C. Stalnaker—B.A., Smith College, 1929. Secretary and 
Research Assistant, Board of Examinations, University of Chicago, 
1931-1936. Research Assistant, College Entrance Examination
Board, 1936-1945. Research Associate, Pepsi-Cola Scholarship
Board, 1945-. 


Erwin K. Taylor—Ph.D., Northwestern University, 1941. Per- 
sonnel Examiner, Illinois State Civil Service Commission, 1942-1943. 
Personnel Technician, Personnel Research Section, Adjutant Gen-
eral's Office, 1943-1945. Chief, Statistical Analysis Unit, Personnel
Research, Adjutant General's Office, 1945-. Fellow, American
Psychological Association. Member, Psychometric Society, Civil
Service Assembly of U. S. and Canada.


Gilbert C. Wrenn—Ph.D., Stanford University, 1932. Vocational
Counselor, Stanford University, 1928-1936. Associate Director, Gen- 
eral College, and Associate Professor of Educational Psychology, 
University of Minnesota, 1936-1938. Professor of Educational Psy-
chology, University of Minnesota, 1938-. On military leave 1942-
1946, serving in the Bureau of Naval Personnel and Pacific area as
Personnel Officer in the U. S. Army. Associate American Youth
Commission, 1939-1941; Consultant, Student Personnel Teacher 
Education Commission of the American Council on Education, 1939-
1942. President, National Vocational Guidance Association; Vice-
President, American College Personnel Association; Vice-President,
Council of Guidance Personnel Association, 1946-. Author and 
co-author of Student Personnel Problems, Studying Effectively, Aids 
to Group Guidance, Time on Our Hands, and of numerous articles. 


STATEMENT OF THE OWNERSHIP, MANAGEMENT, CIRCULATION, ETC.,
REQUIRED BY THE ACTS OF CONGRESS OF AUGUST 24, 1912, AND MARCH
3, 1933, of EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT, published
quarterly at Lancaster, Pennsylvania, for October 1, 1946.
DISTRICT OF COLUMBIA, CITY OF WASHINGTON.
Before me, a Notary Public in and for the State and county aforesaid, personally
appeared Frederic Kuder, who, having been duly sworn according to law, deposes
and says that he is the Editor of the EDUCATIONAL AND PSYCHOLOGICAL
MEASUREMENT, and that the following is, to the best of his knowledge and belief,
a true statement of the ownership, management (and if a daily paper, the circula-
tion), etc., of the aforesaid publication for the date shown in the above caption,
required by the Act of August 24, 1912, as amended by the Act of March 3, 1933,
embodied in section 537, Postal Laws and Regulations, printed on the reverse of this
form, to wit:


1. That the names and addresses of the publisher, editor, managing editor, and
business managers are: Publisher, Frederic Kuder, 917 Fifteenth St., N.W., Wash-
ington, D. C. Editor, Frederic Kuder, 917 Fifteenth St., N.W., Washington, D. C.
Managing Editor, Frederic Kuder, 917 Fifteenth St., N.W., Washington, D. C.
Business Manager, Maxine E. Lytle, 917 Fifteenth St., N.W., Washington, D. C.
2. That the owner is: (If owned by a corporation, its name and address must be
stated and also immediately thereunder the names and addresses of stockholders own-
ing or holding one per cent or more of total amount of stock. If not owned by a
corporation, the names and addresses of the individual owners must be given. If
owned by a firm, company, or other unincorporated concern, its name and address,
as well as those of each individual member, must be given.) Frederic Kuder, 917
[illegible].

3. That the known bondholders, mortgagees, and other security holders owning
or holding 1 per cent or more of total amount of bonds, mortgages, or other securities
are: (If there are none, so state.) None.


4. That the two paragraphs next above, giving the names of the owners, stock-
holders, and security holders, if any, contain not only the list of stockholders and
security holders as they appear upon the books of the company but also, in cases
where the stockholder or security holder appears upon the books of the company as
trustee or in any other fiduciary relation, the name of the person or corporation for
whom such trustee is acting, is given; also that the said two paragraphs contain
statements embracing affiant's full knowledge and belief as to the circumstances and
conditions under which stockholders and security holders who do not appear upon the
books of the company as trustees, hold stock and securities in a capacity other than
that of a bona fide owner; and this affiant has no reason to believe that any other
person, association, or corporation has any interest direct or indirect in the said
stock, bonds, or other securities than as so stated by him.

5. That the average number of copies of each issue of this publication sold or
distributed, through the mails or otherwise, to paid subscribers during the twelve
months preceding the date shown above is [illegible]. (This information is required
from daily, weekly, semi-weekly, and tri-weekly publications.)

Signed: Frederic Kuder, Editor. Sworn to and subscribed before me this 3rd
day of October, 1946. Patrick H. McCormick, Notary Public, D. C.

(My commission expires July 14, 1948.)


